Abstract

A wiretap protocol is a pair of randomized encoding and decoding functions such that knowledge
of a bounded fraction of the encoding of a message reveals essentially no information about
the message, while knowledge of the entire encoding reveals the message using the decoder. In
this paper we study the notion of efficiently invertible extractors and show that a wiretap
protocol can be constructed from such an extractor. We then construct invertible extractors
for symbol-fixing, affine, and general sources and apply them to create wiretap protocols with
asymptotically optimal trade-offs between their rate (the ratio of the length of the message to
the length of its encoding) and resilience (the ratio of the number of observed positions of the
encoding to the length of the encoding). Finally, we apply our results to create wiretap protocols
for challenging communication problems, such as active intruders who change portions of the
encoding, network coding, and intruders observing arbitrary Boolean functions of the encoding.
Keywords: Wiretap Channel, Extractors, Network Coding, Active Intrusion, Exposure Resilient
Cryptography.

1 Introduction

Suppose that Alice wants to send a message to Bob through a communication channel, and that the 
message is partially observable by an intruder. This scenario arises in various practical situations. 
For instance, in a packet network, the sequence transmitted by Alice through the channel can be 
fragmented into small packets at the source and/or along the way and different packets might be 
routed through different paths in the network in which an intruder may have compromised some 
of the intermediate routers. An example that is similar in spirit is furnished by transmission of a 
piece of information from multiple senders to one receiver, across different delivery media, such as 
satellite, wireless, and/or wired networks. Due to limited resources, a potential intruder may be 
able to observe only a fraction of the lines of transmission, and hence only partially observe the
message. As another example, one can consider secure storage of data on a distributed medium
that is physically accessible in parts by an intruder, or a sensitive file on a hard drive that is erased 
from the file system but is only partially overwritten with new or random information, and hence, 
is partially exposed to a malicious party. 

An obvious approach to solve this problem is to use a secret key to encrypt the information 
at the source. However, almost all practical cryptographic techniques are shown to be secure only 
under unproven hardness assumptions and the assumption that the intruder possesses bounded 
computational power. This might be undesirable in certain situations. In the problem we consider, 
we assume the intruder to be information theoretically limited, and our goal will be to employ this 
limitation and construct a protocol that provides unconditional, information-theoretic security,
even in the presence of a computationally unbounded adversary. 

The problem described above was first formalized by Wyner [50] and subsequently by Ozarow
and Wyner [36] as an information-theoretic problem. In its most basic setting, this problem is known
as the wiretap II problem (the description given here follows [36]).¹ Consider a communication
system with a source which outputs a sequence $X = (X_1, \ldots, X_m)$ in $\mathbb{F}_2^m$ uniformly at random. A
randomized algorithm, called the encoder, maps the output of the source to a binary string $Y \in \mathbb{F}_2^n$.
The output of the encoder is then sent through a noiseless channel (called the direct channel) and
is eventually delivered to a decoder² $D$ which maps $Y$ back to $X$. Along the way, an intruder
arbitrarily picks a subset $S \subseteq [n]$ of size $t \le n$, and is allowed to observe $Z := Y|_S$ (through a
so-called wiretap channel), i.e., $Y$ on the coordinate positions corresponding to the set $S$. The goal
is to make sure that the intruder learns as little as possible about $X$, regardless of the choice of $S$.
The security of the system is defined by the conditional entropy
$\Delta := \min_{S \colon |S| = t} H(X \mid Z)$. When $\Delta = m$, the intruder obtains no information about the
transmitted message and we have perfect privacy in the system. Moreover, when $\Delta \to m$ as
$m \to \infty$, we call the system asymptotically perfectly private.
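As a sanity check of the quantity $\Delta$, the following minimal sketch computes it by brute force for two toy encoders. The helper `equivocation`, the encoders `pad_encode` and `plain_encode`, and the tiny parameters are illustrative assumptions and do not appear in the paper.

```python
from collections import defaultdict
from itertools import combinations, product
from math import log2

def equivocation(encode, m, r, n, t):
    """Brute-force min over |S| = t of H(X | Y restricted to S), for uniform X and seed."""
    worst = float("inf")
    p = 1.0 / (2 ** (m + r))             # probability of each (message, seed) pair
    for S in combinations(range(n), t):
        joint = defaultdict(float)       # Pr[X = x, Z = z]
        marg = defaultdict(float)        # Pr[Z = z]
        for x in product((0, 1), repeat=m):
            for seed in product((0, 1), repeat=r):
                y = encode(x, seed)
                z = tuple(y[i] for i in S)
                joint[(x, z)] += p
                marg[z] += p
        h = -sum(pxz * log2(pxz / marg[z]) for (x, z), pxz in joint.items())
        worst = min(worst, h)
    return worst

def pad_encode(x, seed):
    # Hypothetical toy encoder: Y = (seed, x XOR seed), with m = r = 1 and n = 2.
    return (seed[0], x[0] ^ seed[0])

def plain_encode(x, seed):
    # Insecure baseline: the message bit is sent in the clear next to the seed.
    return (seed[0], x[0])

delta_pad = equivocation(pad_encode, m=1, r=1, n=2, t=1)
delta_plain = equivocation(plain_encode, m=1, r=1, n=2, t=1)
```

Under these assumptions the pad-style encoder attains $\Delta = m = 1$ for $t = 1$, which is perfect privacy, while the baseline that exposes the message bit gives $\Delta = 0$.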

Remark 1. The assumption that X is sampled from an i.i.d. and uniform random source should 
not be confused with the fact that Alice is transmitting one particular message to Bob that is 
fixed and known to her before the transmission. In this case, the randomness of X in the model 
captures the a priori uncertainty about X for the outside world, and in particular the intruder, but 
not the transmitter. As an intuitive example, suppose that a random key is agreed upon between 
Alice and a trusted third party, and now Alice wishes to securely send her particular key to Bob 
over a wiretapped channel. Or, assume that Alice wishes to send an audio stream to Bob that is 
encoded and compressed using a conventional audio encoding method. Furthermore, the particular 
choice of the distribution on X as a uniformly random sequence will cause no loss of generality. 
If the distribution of X is publicly known to be non-uniform, the transmitter can use a suitable 
source-coding scheme to compress the source to its entropy prior to the transmission, and ensure 
that from the intruder's point of view, X is uniformly distributed. On the other hand, it is also easy 
to see that if a protocol achieves perfect privacy under uniform message distribution, it achieves 
perfect privacy under any other distribution as well. 



1.1 Our Model 

The model that we will be considering is motivated by the original wiretap channel problem but 
is more stringent in terms of its security requirements. In particular, instead of using Shannon 
entropy as a measure of uncertainty, we will rely on statistical indistinguishability which is a 
stronger measure that is more widely used in cryptography. 



1 We will use the following notation: We denote the set $\{1, \ldots, n\}$ by $[n]$. For a vector $x = (x_1, \ldots, x_n)$ and a
subset $S \subseteq [n]$, we denote by $x|_S$ the projection of $x$ onto the coordinate positions given by elements of $S$. If $X$ and
$Y$ are random variables, then $X|(Y = y)$ is the random variable $X$ conditioned on the event $Y = y$. We use $\mathcal{U}_S$ to
denote the uniform distribution on the set $S$, $\mathcal{U}_n$ as a short-hand for $\mathcal{U}_{\mathbb{F}_2^n}$, and $X \sim \mathcal{U}_S$ to denote a random variable
with distribution $\mathcal{U}_S$. Two distributions $A$ and $B$ are called $\epsilon$-close, in symbols $A \sim_\epsilon B$, if their statistical distance is
at most $\epsilon$. All the necessary background for this work is presented in detail in Appendix A.

2 Ozarow and Wyner also consider the case in which the decoder errs with negligible probability, but we are going
to consider only error-free decoders.

Definition 2. Let $\Sigma$ be a set of size $q$, $m$ and $n$ be positive integers, and $\epsilon, \gamma \ge 0$. A $(t, \epsilon, \gamma)_q$-resilient
wiretap protocol of block length $n$ and message length $m$ is a pair of functions $E\colon \Sigma^m \times \mathbb{F}_2^r \to \Sigma^n$
(the encoder) and $D\colon \Sigma^n \to \Sigma^m$ (the decoder) that are computable in time polynomial in $m$, such
that

(a) (Decodability) For all $x \in \Sigma^m$ and all $z \in \mathbb{F}_2^r$ we have $D(E(x, z)) = x$,

(b) (Resiliency) Let $X \sim \mathcal{U}_{\Sigma^m}$, $R \sim \mathcal{U}_r$ and $Y = E(X, R)$. For a set $S \subseteq [n]$ and $w \in \Sigma^{|S|}$,
let $\mathcal{X}_{S,w}$ denote the distribution of $X$ conditioned on the event $Y|_S = w$. Define the set of
bad observations as $B_S := \{ w \in \Sigma^{|S|} \mid \mathrm{dist}(\mathcal{X}_{S,w}, \mathcal{U}_{\Sigma^m}) > \epsilon \}$. Then we require that for every
$S \subseteq [n]$ of size at most $t$, $\Pr[Y|_S \in B_S] \le \gamma$.

The encoding of a vector $x \in \Sigma^m$ is accomplished by choosing a vector $Z \in \mathbb{F}_2^r$ uniformly at
random, and calculating $E(x, Z)$. The quantities $R = m/n$, $\epsilon$, and $\gamma$ are called the rate, the error,
and the leakage of the protocol, respectively. By a slight abuse of notation, we call $\delta = t/n$ the
(relative) resilience of the protocol.
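The resiliency condition can be checked exhaustively for very small parameters. The sketch below is only an illustration of the definition: the function `leakage`, the toy encoder `toy_encode` with its decoder `toy_decode`, and the parameters $m = 1$, $r = 2$, $n = 3$ are assumptions made for this example.

```python
from collections import defaultdict
from itertools import combinations, product

def leakage(encode, alphabet, m, r, n, t, eps):
    """Largest Pr[Y|_S in B_S] over all S of size at most t, for the threshold eps."""
    msgs = list(product(alphabet, repeat=m))
    seeds = list(product((0, 1), repeat=r))
    total = len(msgs) * len(seeds)
    worst = 0.0
    for size in range(t + 1):
        for S in combinations(range(n), size):
            counts = defaultdict(lambda: defaultdict(int))   # w -> x -> count
            for x in msgs:
                for z in seeds:
                    w = tuple(encode(x, z)[i] for i in S)
                    counts[w][x] += 1
            bad_mass = 0.0
            for w, by_x in counts.items():
                n_w = sum(by_x.values())
                # statistical distance between X | (Y|_S = w) and the uniform distribution
                dist = 0.5 * sum(abs(by_x.get(x, 0) / n_w - 1 / len(msgs)) for x in msgs)
                if dist > eps:
                    bad_mass += n_w / total
            worst = max(worst, bad_mass)
    return worst

def toy_encode(x, z):
    # Hypothetical encoder: two random bits followed by the message bit masked by both.
    return (z[0], z[1], x[0] ^ z[0] ^ z[1])

def toy_decode(y):
    return (y[0] ^ y[1] ^ y[2],)

leak_t2 = leakage(toy_encode, (0, 1), m=1, r=2, n=3, t=2, eps=0.0)
leak_t3 = leakage(toy_encode, (0, 1), m=1, r=2, n=3, t=3, eps=0.0)
```

With these choices the toy pair behaves like a $(2, 0, 0)_2$-resilient protocol, whereas observing all three positions makes every observation bad.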

In our definition the imperfection of the protocol is captured by the two parameters $\epsilon$ and $\gamma$.
When $\epsilon = \gamma = 0$, the above definition coincides with the original wiretap channel problem for
the case of perfect privacy. When $\gamma = 0$, we will have a worst-case guarantee, namely, that the
intruder's views of the message before and after his observation are statistically close, regardless
of the outcome of the observation. When $\gamma > 0$, a particular observation might potentially reveal
to the intruder a lot of information about the message. However, a negligible $\gamma$ will ensure that
such a bad event (or leakage) happens only with negligible probability. All the constructions in this
paper achieve zero leakage (i.e., $\gamma = 0$), except for the general result in Subsection 5.3 for which a
nonzero leakage is inevitable. The significance of zero-leakage protocols is that they assure adaptive
resiliency in the weak sense introduced in [13] for exposure-resilient functions: if the intruder is
given the encoded sequence as an oracle that he can adaptively query at up to $t$ coordinates and
is afterwards presented with a challenge which is either the original message or an independent
uniformly chosen random string, he will not be able to distinguish between the two cases.

In general, it is straightforward to verify that our model can be used to solve the original
wiretap II problem, with $\Delta \ge m(1 - \epsilon - \gamma)$. This is proved in Appendix B.1. Hence, we will achieve
asymptotically perfect privacy when $\epsilon + \gamma = o(1/m)$. For all the protocols that we present in this
paper this quantity will be superpolynomially small.



1.2 Related Notions in Cryptography 

There are several interrelated notions in the literature on Cryptography and Theoretical Computer
Science that are also closely related to our definition of the wiretap protocol. These are resilient
functions (RF) and almost perfect resilient functions (APRF), exposure-resilient functions (ERF),
and all-or-nothing transforms (AONT) (cf. [?], [39], [?], [27] and [?] for a comprehensive
account of several important results in this area).

The notion of resilient functions was introduced in [3] (and also [19] as the bit-extraction
problem). A deterministic polynomial-time computable function $f\colon \mathbb{F}_2^n \to \mathbb{F}_2^m$ is called $t$-resilient if
whenever any $t$ bits of its input are arbitrarily chosen by an adversary and the rest of the bits
are chosen uniformly at random, then the output distribution of the function is (close to) uniform.
ERFs, introduced in [5], are similar to resilient functions except that the entire input is chosen
uniformly at random, and the view of the adversary from the output remains (close to) uniform
even after observing any $t$ input bits of his choice. ERFs and resilient functions are known to be
useful in a scenario similar to the wiretap channel problem where the two parties aim to agree
on any random string, for example a session key (Alice generates $x$ uniformly at random which
she sends to Bob, and then they agree on the string $f(x)$). Here no control on the content of the
message is required, and the only goal is that at the end of the protocol the two parties agree on
any random string that is uniform even conditioned on the observations of the intruder. Hence, our
definition of a wiretap protocol is more stringent than that of resilient functions, since it requires
the existence and efficient computability of the encoding function $E$ that provides a control over
the content of the message (see Remark 1).

Another closely related notion is that of all-or-nothing transforms, which was suggested in [39]
for protection of block ciphers. A randomized polynomial-time computable function $f\colon \mathbb{F}_2^m \to \mathbb{F}_2^n$
($m \le n$) is called a (statistical, non-adaptive, and secret-only) $t$-AONT with error $\epsilon$ if it is efficiently
invertible and for every $S \subseteq [n]$ such that $|S| \le t$, and all $x_1, x_2 \in \mathbb{F}_2^m$, we have that the two
distributions $f(x_1)|_S$ and $f(x_2)|_S$ are $\epsilon$-close. An AONT with $\epsilon = 0$ is called perfect. It is easy
to see that perfectly private wiretap protocols are equivalent to perfect adaptive AONTs. It was
shown in [13] that such functions cannot exist (with positive, constant rate) when the adversary
is allowed to observe more than half of the encoded bits. A similar result was obtained in [9] for
the case of perfect linear RFs.

As pointed out in [13], AONTs can be used in the original scenario of Ozarow and Wyner's
wiretap channel problem. However, the best known constructions of AONTs can achieve
rate-resilience trade-offs that are far from the information theoretic optimum (see Figure 1). While an
AONT requires indistinguishability of the intruder's view for every fixed pair $(x_1, x_2)$ of messages, the
relaxed notion of average-case AONT requires the expected distance of $f(x_1)|_S$ and $f(x_2)|_S$ to be at
most $\epsilon$ for a uniform random message pair. Hence, for a negligible $\epsilon$, the distance will be negligible
for all but a negligible fraction of message pairs. Up to a loss in parameters, wiretap protocols are
equivalent to average-case AONTs:

Lemma 3. Let $(E, D)$ be an encoding/decoding pair for a $(t, \epsilon, \gamma)_2$-resilient wiretap protocol. Then
$E$ is an average-case $t$-AONT with error at most $2(\epsilon + \gamma)$. Conversely, an average-case $t$-AONT
with error $\eta^2$ can be used as a $(t, \eta, \eta)$-resilient wiretap encoder. □



A proof of the above lemma is given in Appendix B.2. Note that the converse direction does
not guarantee zero leakage, hence, zero leakage wiretap protocols are in general stronger than
average-case AONTs. An average-case to worst-case reduction for AONTs was shown in [8] which,
combined with the above lemma, shows that any wiretap protocol can be used to construct an
AONT.

A simple universal transformation was proposed in [8] to obtain an AONT from any ERF, by
one-time padding the message with a random string obtained from the ERF. This construction
can also yield a wiretap protocol with zero leakage. However, it has the drawback of significantly
weakening the rate-resilience trade-off. Namely, even if an information theoretically optimal ERF
is used in this reduction, the resulting wiretap protocol will only achieve half the optimal rate (see
Figure 1).
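The one-time pad transformation and its rate loss can be illustrated with a small sketch. Everything here is an assumption made for illustration: the ERF is replaced by the parity function (one output bit), the names `otp_encode`, `otp_decode` and `otp_rate` are hypothetical, and the rate calculation idealizes an optimal ERF as one that outputs $n - t$ bits from $n$ input bits while tolerating $t$ observations.

```python
def parity(bits):
    """Stand-in ERF with one output bit: the XOR of all input bits."""
    return sum(bits) % 2

def otp_encode(x, r):
    """Encode the message bit x as (r, x XOR f(r)) for a random string r."""
    return tuple(r) + (x ^ parity(r),)

def otp_decode(y):
    *r, masked = y
    return masked ^ parity(r)

def otp_rate(n, t):
    """Rate and relative resilience under the idealized-ERF assumption above."""
    m = n - t            # idealized ERF output length = message length
    block = n + m        # the encoding consists of r followed by the padded message
    return m / block, t / block

rate, delta = otp_rate(n=100, t=20)
```

Under this idealization the rate equals $(1 - \delta)/2$, which is half of the information-theoretic optimum $1 - \delta$.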

Another concept that is close to our work is that of privacy amplification [2], [?], [33], [?]. Similar
to resilient functions, the goal here is for Alice and Bob to agree on a common secret share, but
the intruder is given more power. This is compensated by the fact that Alice and Bob have access
to a secure channel not available to the intruder over which they can transmit securely some bits.

Figure 1: A comparison of the rate vs. resilience trade-offs achieved by the wiretap protocols for
the binary alphabet (left) and larger alphabets (right, in this example of size 64). (1) Information
theoretic bound, attained by Theorem 17; (2) The bound approached by [27]; (3) Protocol based
on best non-explicit binary linear codes [20], [48]; (4) AONT construction of [5], assuming that the
underlying ERF is optimal; (5) Random walk protocol of Corollary 8; (6) Protocol based on the
best known explicit [37] and non-explicit [20], [48] linear codes.

The main focus of this paper is on asymptotic trade-offs between the rate $R$ and the resilience
$\delta$ of an asymptotically perfectly private wiretap protocol. Following [36], it is easy to see that in
this case, an information-theoretic bound $R \le 1 - \delta + o(1)$ must hold. Lower bounds for $R$ in
terms of $\delta$ have been studied by a number of researchers. For the case of perfect privacy ($\epsilon = 0$),
Ozarow and Wyner [36] give a construction of a wiretap protocol using linear error-correcting
codes, and show that the existence of an $[n, k, d]_q$-code implies the existence of a perfectly private,
$(d - 1, 0, 0)_q$-resilient wiretap protocol of message length $n - k$ and block length $n$. As a result,
the Gilbert-Varshamov bound on linear codes [20], [48] implies that asymptotically $R \ge 1 - h_q(\delta)$,
where $h_q$ is the $q$-ary entropy function (defined in Appendix A). If $q \ge 49$ is a square, the bound
can be further improved to $R \ge 1 - \delta - 1/(\sqrt{q} - 1)$ using Goppa's AG-codes [21], [?]. In these
protocols, the encoder can be seen as an adaptively secure, perfect AONT and the decoder is an
adaptive perfect RF. Moving away from perfect to asymptotically perfect privacy, it was shown
in [27] that for any $\gamma > 0$ there exist binary asymptotically perfectly private wiretap protocols with
$R \ge 1 - 2\delta - \gamma$ and exponentially small error.³ This bound strictly improves the coding theoretic
bound of Ozarow and Wyner for the binary alphabet.
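The bounds quoted in this paragraph are easy to compare numerically. The following sketch only evaluates the closed-form expressions stated above; the function names are hypothetical, and `bound_27_limit` evaluates the bound of [27] in the limit $\gamma \to 0$.

```python
from math import log, sqrt

def h_q(x, q):
    """q-ary entropy function for 0 <= x < 1."""
    if x == 0:
        return 0.0
    return x * log(q - 1, q) - x * log(x, q) - (1 - x) * log(1 - x, q)

def upper_bound(delta):
    """Information-theoretic limit R <= 1 - delta (ignoring the o(1) term)."""
    return 1 - delta

def gilbert_varshamov_rate(delta, q):
    """Rate 1 - h_q(delta) from codes on the Gilbert-Varshamov bound (delta <= 1 - 1/q)."""
    return max(0.0, 1 - h_q(delta, q))

def ag_code_rate(delta, q):
    """Rate 1 - delta - 1/(sqrt(q) - 1) from AG-codes, for square q >= 49."""
    return max(0.0, 1 - delta - 1 / (sqrt(q) - 1))

def bound_27_limit(delta):
    """Rate 1 - 2*delta, the limit of the binary bound of [27] as gamma tends to 0."""
    return max(0.0, 1 - 2 * delta)

binary = (gilbert_varshamov_rate(0.1, 2), bound_27_limit(0.1), upper_bound(0.1))
large = (gilbert_varshamov_rate(0.5, 64), ag_code_rate(0.5, 64), upper_bound(0.5))
```

For $\delta = 0.1$ over the binary alphabet the three values in `binary` are increasing, and for $q = 64$ and $\delta = 0.5$ the AG-code rate exceeds the Gilbert-Varshamov rate while both stay below $1 - \delta$.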

3 Actually, what is proved in this paper is the existence of t-resilient functions which correspond to decoders in our 
wiretap setting; however, it can be shown that these functions also possess efficient encoders, so that it is possible to 
construct wiretap protocols from them. 
1.3 Overview of our Results



In this paper we prove several lower bounds for the rate of asymptotically perfectly private wiretap
protocols with negligible, i.e., superpolynomially small, error. Our main tool is the design of
various types of invertible extractors, a concept defined in Section 2. Our first bound, described
in Section 3, shows that if the alphabet size is $d$, then there exists $\alpha_d \in (0, 1)$ such that for every
$\eta > 0$ and every constant resilience $\delta \in [0, 1)$, we have rate $R \ge \max\{\alpha_d(1 - \delta), 1 - \delta/\alpha_d\} - \eta$ with
exponentially small error. This is achieved by suitably modifying the symbol-fixing extractor of
Kamp and Zuckerman [23]. Contrary to the coding theoretic construction of Ozarow and Wyner,
for a fixed alphabet size our bound gives a positive rate for every constant resilience $\delta \in [0, 1)$.

Even though the bound in Section 3 is superseded by our main result in Section 4, we have
included it because of its simplicity and potential for practical use. Our second bound (Theorem 17)
matches the information-theoretic upper bound of Ozarow and Wyner. Namely, for any prime power
alphabet size $q$, and any resilience $\delta \in [0, 1)$, we construct a wiretap protocol with superpolynomially
small error, zero leakage and rate $R \ge 1 - \delta - o(1)$. In fact, this bound holds in a more general setting
in which the intruder is not only allowed to look at a $\delta$-fraction of the symbols of Alice's message,
but is also allowed to perform any linear preprocessing of Alice's message before doing so. The
power of this result stems largely from a black box transformation which makes certain seedless
extractors invertible. More specifically, the results of this section are obtained by applying this
transformation to certain affine extractors.

In Sections 5.1 and 5.2 we will demonstrate several important applications of this fact in the
context of network coding and wiretapped communication in the presence of noise and active
intruders. In particular we provide, for the first time, an optimal solution to the wiretap problem
in network coding [7] without imposing any restrictive assumptions. A plot of the bounds can be
found in Figure 1.

The final application in Section 5.3 studies an all-powerful intruder who is only limited by the 
amount of information he can obtain from Alice's encoded message, and not by the nature of the 
observations. By inverting seeded extractors with nearly-optimal output lengths, we will show that 
if Alice and Bob have access to a side channel over which Alice can publicly send a polylogarithmic 
number of bits to Bob, then their communication on the main channel can be made secure even if 
the intruder can access the values of any t Boolean functions of Alice's encoded message. 



2 Inverting Extractors 

In this section we will introduce the notion of invertible extractors and its connection with wiretap
protocols.⁴ Later we will use this connection to construct wiretap protocols with good rate-resilience
trade-offs.

Definition 4. Let $\Sigma$ be a finite alphabet and $f$ be a mapping from $\Sigma^n$ to $\Sigma^m$. For $\gamma \ge 0$, a function
$A\colon \Sigma^m \times \mathbb{F}_2^r \to \Sigma^n$ is called a $\gamma$-inverter for $f$ if the following conditions hold:

(a) (Inversion) For every $x \in \Sigma^m$ such that $f^{-1}(x)$ is nonempty, and for every $z \in \mathbb{F}_2^r$, we have
$f(A(x, z)) = x$.

(b) (Uniformity) $A(\mathcal{U}_{\Sigma^m}, \mathcal{U}_r) \sim_\gamma \mathcal{U}_{\Sigma^n}$.

A $\gamma$-inverter is called efficient if there is a randomized algorithm that runs in worst case polynomial
time and, given $x \in \Sigma^m$ and $z$ as a random seed, computes $A(x, z)$. We call a mapping $\gamma$-invertible
if it has an efficient $\gamma$-inverter, and drop the prefix $\gamma$ from the notation when it is zero.

4 Another notion of invertible extractors was introduced in [15] and used in [14] for a different application (entropic
security) that should not be confused with the one we use. Their notion applies to seeded extractors with long seeds
that are efficiently invertible bijections for every fixed seed. Such extractors can be seen as a single-step walk on
highly expanding graphs that mix in one step. This is in a way similar to the multiple-step random walk used in the
seedless extractor of Section 3 that can be regarded as a single-step walk on the expander graph raised to a certain
power.

Remark 5. If a function $f$ maps the uniform distribution to a distribution that is $\epsilon$-close to
uniform (as is the case for all extractors), then any randomized mapping that maps its input $x$ to
a distribution that is $\gamma$-close to the uniform distribution on $f^{-1}(x)$ is easily seen to be an $(\epsilon + \gamma)$-inverter
for $f$. In some situations designing such a function might be easier than directly following
the above definition.



The idea of random preimage sampling was proposed in [13] for construction of adaptive AONTs
from APRFs. However, they ignored the efficiency of the inversion, as their goal was to show the
existence of (not necessarily efficient) information-theoretically optimal adaptive AONTs. Moreover,
the strong notion of APRF and a perfectly uniform sampler is necessary for their construction
of AONTs. As wiretap protocols are weaker than (worst-case) AONTs, they can be constructed
from slightly imperfect inverters as shown by the following lemma, whose proof can be found in
Appendix B.3.



Lemma 6. Let $\Sigma$ be an alphabet of size $q > 1$ and $f\colon \Sigma^n \to \Sigma^m$ be a $(\gamma^2/2)$-invertible $q$-ary $(k, \epsilon)$
symbol-fixing extractor. Then, $f$ and its inverter can be seen as a decoder/encoder pair for an
$(n - k, \epsilon + \gamma, \gamma)_q$-resilient wiretap protocol with block length $n$ and message length $m$. □
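A very small instance may help to see how an extractor and its inverter act as decoder and encoder. The sketch below uses the parity of $n$ bits, which outputs a perfectly uniform bit whenever at least one input bit is uniform, as a toy bit-fixing extractor with $k = 1$ and $\epsilon = 0$. The choice of parity, the names `f`, `inverter` and `view`, and the block length `N = 4` are assumptions made for this example.

```python
from collections import Counter
from itertools import combinations, product

N = 4  # block length n of the toy protocol

def f(y):
    """Toy extractor (the decoder): parity of the n input bits."""
    return sum(y) % 2

def inverter(x, z):
    """Toy 0-inverter (the encoder): a random preimage of x driven by the seed z in F_2^(n-1)."""
    return tuple(z) + ((x + sum(z)) % 2,)

def view(x, S):
    """Counts, over all seeds, of the encoding of x restricted to the positions in S."""
    return Counter(tuple(inverter(x, z)[i] for i in S)
                   for z in product((0, 1), repeat=N - 1))

SEEDS = list(product((0, 1), repeat=N - 1))
# (a) Inversion: every output of the inverter is a preimage of x under f.
inversion_ok = all(f(inverter(x, z)) == x for x in (0, 1) for z in SEEDS)
# (b) Uniformity: a uniform message and a uniform seed give a uniform n-bit string.
outputs = Counter(inverter(x, z) for x in (0, 1) for z in SEEDS)
uniform_ok = len(outputs) == 2 ** N and set(outputs.values()) == {1}
# Resiliency: any n - 1 observed positions look the same for both messages.
private_ok = all(view(0, S) == view(1, S) for S in combinations(range(N), N - 1))
```

In this toy case the pair behaves like an $(n - 1, 0, 0)_2$-resilient protocol with message length one, in line with the parameters $(n - k, \epsilon + \gamma, \gamma)$ of the lemma for $k = 1$ and $\epsilon = \gamma = 0$.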



3 A Wiretap Protocol Based on Random Walks 

In this section we describe a wiretap protocol that achieves a rate $R$ within a constant fraction of
the information theoretically optimal value $1 - \delta$ (the constant depending on the alphabet size).
We will use preliminaries from Appendix A.2.



To achieve our result, we will modify the symbol-fixing extractor of Kamp and Zuckerman [23]
to make it efficiently invertible without affecting its extraction properties, and then apply Lemma 6
above to obtain the desired wiretap protocol. The extractor of [23] starts with a fixed vertex in a
large expander graph and interprets the input as the description of a walk on the graph. Then it
outputs the label of the vertex reached at the end of the walk. Notice that a direct approach to
invert this function will amount to sampling a path of a particular length between a pair of vertices
in the graph, uniformly among all the possibilities, which might be a difficult problem for good
families of expander graphs. We work around this problem by choosing the starting point of the
walk from the input.⁵ In particular we show the following:

Theorem 7. Let $G$ be a constructible $d$-regular graph with $d^m$ vertices and second largest eigenvalue
$\lambda_G \ge 1/\sqrt{d}$. Then there exists an explicit invertible $(k, 2^{s/2})_d$ symbol-fixing extractor
$\mathrm{SFExt}\colon [d]^n \to [d]^m$, such that

$$s := \begin{cases} m \log d + k \log \lambda_G^2 & \text{if } k \le n - m, \\ (n - k) \log d + (n - m) \log \lambda_G^2 & \text{if } k > n - m. \end{cases}$$

5 The idea of choosing the starting point of the walk from the input sequence has been used before in extractor
constructions [52], but in the context of seeded extractors for general sources with high entropy.


Proof. (Sketch) We first describe the extractor and its inverse. Given an input $(v, w) \in [d]^m \times [d]^{n-m}$,
the function SFExt interprets $v$ as a vertex of $G$ and $w$ as the description of a walk starting
from $v$. The output is the index of the vertex reached at the end of the walk. The inverter Inv
works as follows: Given $x \in [d]^m$, $x$ is interpreted as a vertex of $G$. Then Inv picks $W \in [d]^{n-m}$
uniformly at random. Let $V$ be the vertex starting from which the walk described by $W$ ends
up in $x$. The inverter outputs $(V, W)$. It is easy to verify that Inv satisfies the properties of a
0-inverter. The proof that SFExt is an extractor with the given parameters is similar to the original
proof of Kamp and Zuckerman, but takes our modifications into account. We defer the details to
Appendix B.4. □
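The mechanics of SFExt and Inv can be sketched on a toy graph. The sketch assumes an edge labeling in which every label acts as a permutation of the vertices, so that each walk can be traced backwards; the permutations below are pseudorandom stand-ins chosen for illustration and are not claimed to form an expander. All names and the parameters `D`, `M`, `N` are hypothetical.

```python
import random
from itertools import product

D, M, N = 3, 2, 6            # alphabet size d, output length m, input length n
VERTICES = D ** M

rng = random.Random(0)
PERMS = []                   # PERMS[a][v]: vertex reached from v along the edge labeled a
for _ in range(D):
    p = list(range(VERTICES))
    rng.shuffle(p)
    PERMS.append(p)
INV_PERMS = [[p.index(v) for v in range(VERTICES)] for p in PERMS]

def to_vertex(symbols):
    v = 0
    for s in symbols:
        v = v * D + s
    return v

def to_symbols(v):
    out = []
    for _ in range(M):
        out.append(v % D)
        v //= D
    return tuple(reversed(out))

def sf_ext(inp):
    """Start at the vertex named by the first m symbols and follow the remaining n - m labels."""
    v = to_vertex(inp[:M])
    for a in inp[M:]:
        v = PERMS[a][v]
    return to_symbols(v)

def inv(x, walk):
    """Given the end vertex x and a random walk, trace the walk backwards to its start."""
    v = to_vertex(x)
    for a in reversed(walk):
        v = INV_PERMS[a][v]
    return to_symbols(v) + tuple(walk)

MESSAGES = list(product(range(D), repeat=M))
WALKS = list(product(range(D), repeat=N - M))
inversion_ok = all(sf_ext(inv(x, w)) == x for x in MESSAGES for w in WALKS)
bijection_ok = len({inv(x, w) for x in MESSAGES for w in WALKS}) == D ** N
```

Because the map from (message, walk) to the inverter's output is a bijection onto $[d]^n$ in this sketch, a uniform message and a uniform walk give a uniform output, which is the uniformity property of a 0-inverter.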



Combining this with Lemma 6 and setting up the right asymptotic parameters, we obtain
our protocol for the wiretap channel problem.

Corollary 8. Let $\delta \in [0, 1)$ and $\gamma > 0$ be arbitrary constants, and suppose that there is a constructive
family of $d$-regular expander graphs with spectral gap at least $1 - \lambda$, for a constant $\lambda < 1$.
Moreover suppose that there exists a constant $c \ge 1$ such that for every large enough $N$ the family
contains a graph with at least $N$ and at most $cN$ vertices. Then for every large enough $n$ there is a
$(\delta n, 2^{-\Omega(n)}, 0)_d$-resilient wiretap protocol with block length $n$ and rate
$R = \max\{\alpha(1 - \delta), 1 - \delta/\alpha\} - \gamma$, where $\alpha := -\log_d \lambda^2$.

Proof. (Sketch) For the case $c = 1$ we use Lemma 6 with the extractor SFExt of Theorem 7 and
its inverse. Every infinite family of graphs must satisfy $\lambda \ge 2\sqrt{d - 1}/d$ [22], and in particular
we have $\lambda \ge 1/\sqrt{d}$, as required by Theorem 7. We choose the parameters $k := (1 - \delta)n$ and
$m := n(\max\{\alpha(1 - \delta), 1 - \delta/\alpha\} - \gamma)$, which gives $s = -\Omega(n)$, and hence, exponentially small error.
The case $c > 1$ is similar, but involves technicalities for dealing with lack of graphs of arbitrary size
in the family. We will elaborate on this in Appendix B.5. □
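The claim that this parameter choice yields a negative exponent $s$ that is linear in $n$ can be checked numerically. The sketch below evaluates the two-case formula of Theorem 7 with logarithms taken to base 2 (an assumption), rounds $m$ down to an integer, and uses $d = 8$ with $\lambda = 2\sqrt{d-1}/d$ as an example value. The function names are hypothetical.

```python
from math import log2, sqrt

def theorem7_exponent(n, k, m, d, lam):
    """Exponent s of Theorem 7; the extractor error is 2**(s / 2)."""
    if k <= n - m:
        return m * log2(d) + k * log2(lam ** 2)
    return (n - k) * log2(d) + (n - m) * log2(lam ** 2)

def corollary8_parameters(n, delta, gamma, d, lam):
    """Parameters k and m chosen as in the proof sketch (m rounded down)."""
    alpha = -log2(lam ** 2) / log2(d)
    rate = max(alpha * (1 - delta), 1 - delta / alpha) - gamma
    return round((1 - delta) * n), int(rate * n)

n, gamma, d = 1000, 0.05, 8
lam = 2 * sqrt(d - 1) / d

k_hi, m_hi = corollary8_parameters(n, 0.5, gamma, d, lam)   # first case, k <= n - m
s_hi = theorem7_exponent(n, k_hi, m_hi, d, lam)

k_lo, m_lo = corollary8_parameters(n, 0.1, gamma, d, lam)   # second case, k > n - m
s_lo = theorem7_exponent(n, k_lo, m_lo, d, lam)
```

In both sample settings $s$ comes out negative and proportional to $\gamma n$, so the error $2^{s/2}$ decays exponentially in $n$.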



Using explicit constructions of Ramanujan graphs that achieve $\lambda \le 2\sqrt{d - 1}/d$ when $d - 1$ is
a prime power [31], [?], [?], one can obtain $\alpha \ge 1 - 2/\log d$, which can be made arbitrarily close
to one (hence, making the protocol arbitrarily close to the optimal bound) by choosing a suitable
alphabet size that does not depend on $n$.
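The dependence of $\alpha$ on the alphabet size is easy to tabulate. The sketch below plugs the extremal value $\lambda = 2\sqrt{d-1}/d$ into $\alpha = -\log_d \lambda^2$ and evaluates the rate expression of Corollary 8. The function names are hypothetical, and the logarithm in the bound $1 - 2/\log d$ is taken to base 2.

```python
from math import log, log2, sqrt

def ramanujan_alpha(d):
    """alpha = -log_d(lambda^2) for lambda = 2*sqrt(d - 1)/d."""
    lam = 2 * sqrt(d - 1) / d
    return -log(lam ** 2, d)

def random_walk_rate(delta, alpha, gamma=0.0):
    """Rate max{alpha(1 - delta), 1 - delta/alpha} - gamma from Corollary 8."""
    return max(alpha * (1 - delta), 1 - delta / alpha) - gamma

# Degrees d for which d - 1 is a prime (3, 7, 31, 127, 8191).
DEGREES = (4, 8, 32, 128, 8192)
alphas = {d: ramanujan_alpha(d) for d in DEGREES}
rate_small = random_walk_rate(0.3, alphas[8])
rate_large = random_walk_rate(0.3, alphas[8192])
```

As $d$ grows, `alphas[d]` approaches one and the rate at a fixed resilience moves toward the limit $1 - \delta$.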

Remark 9. The invertible symbol-fixing extractor above can be used to construct an invertible
bit-fixing extractor with exponentially small error by the partitioning method introduced in [23].
Kamp and Zuckerman used their extractor as an APRF in the one-time pad construction of [8] to
obtain an adaptively secure statistical AONT. We note that using the invertible bit-fixing extractor
as an APRF, there will be no need for the one-time pad technique and one can directly obtain an
adaptive AONT (which is simply the corresponding inverter) with a smaller blow-up compared to
[23].
4 Invertible Affine Extractors and Asymptotically Optimal Wiretap Protocols

In this section we will construct a black box transformation for making certain seedless extractors 
invertible. The method is described in detail for affine extractors, and leads to wiretap protocols 
with asymptotically optimal rate-resilience trade-offs. 

A seeded extractor is called linear if it is a linear function for every fixed choice of the seed.
Trevisan [46] gave a fascinating explicit construction of strong linear extractors based on
pseudorandom generators from hard Boolean functions. His construction was later improved by Raz,
Reingold and Vadhan [38]. Trevisan's extractor (and its subsequent improvement) can be viewed
as a careful puncturing of an $\mathbb{F}_2$-linear error-correcting code, where the puncturing depends on the
choice of the seed. This immediately implies the linearity of the extractor. In this work, we will
use the following result.

Theorem 10. [38] There is an explicit strong linear seeded $(k, \epsilon)$-extractor $\mathrm{Ext}\colon \mathbb{F}_2^n \times \mathbb{F}_2^d \to \mathbb{F}_2^m$
with $d = O(\log^3(n/\epsilon))$ and $m = k - O(d)$. □

Remark 11. We note that our arguments would identically work for any other linear seeded
extractor as well, for instance those constructed in [45], [42]. However, the most crucial parameter in
our application is the output length of the extractor, being closely related to the rate of the wiretap
protocols we obtain. Among the constructions we are aware of, the result quoted in Theorem 10 is
the best in this regard.

A recent result by Shaltiel [?] gives a general framework for transforming every seedless
extractor (for a family of sources satisfying a certain closedness condition) with short output length
to one with an almost optimal output length. The construction uses the imperfect seedless
extractor to extract a small number of uniform random bits from the source, and will then use the
resulting sequence as the seed for a seeded extractor to extract more random bits from the source.
For a suitable choice of the seeded extractor, one can use this construction to extract almost all
min-entropy of the source. The closedness condition needed for this result to work for a family $\mathcal{C}$
of sources is that, letting $E(x, s)$ denote the seeded extractor with seed $s$, for every $X \in \mathcal{C}$ and
every fixed $s$ and $y$, the distribution $(X \mid E(X, s) = y)$ belongs to $\mathcal{C}$. If $E$ is a linear function for
every fixed $s$ (as Trevisan's extractor is), the result will be available for affine sources (since we
are imposing a linear constraint on an affine source, it remains an affine source). A more precise
statement of Shaltiel's main result is the following:

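Theorem 12. [?] Let $\mathcal{C}$ be a class of distributions on $\mathbb{F}_2^n$ and $F\colon \mathbb{F}_2^n \to \mathbb{F}_2^t$ be an extractor for
$\mathcal{C}$ with error $\epsilon$. Let $E\colon \mathbb{F}_2^n \times \mathbb{F}_2^t \to \mathbb{F}_2^m$ be a function for which $\mathcal{C}$ satisfies the closedness condition
above. Then for every $X \in \mathcal{C}$, $E(X, F(X)) \sim_{\epsilon \cdot 2^{t+3}} E(X, \mathcal{U}_t)$. □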
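The closedness condition for affine sources and linear maps can be verified by enumeration on a small example. The sketch below assumes a concrete affine source in $\mathbb{F}_2^5$ and a concrete linear map (both chosen arbitrarily for illustration) and uses the fact that a nonempty subset of $\mathbb{F}_2^n$ is an affine subspace exactly when it is closed under sums of three of its elements. All names are hypothetical.

```python
from itertools import product

def xor(a, b):
    return tuple(p ^ q for p, q in zip(a, b))

def affine_span(shift, basis):
    """All points shift + (F_2-combination of the basis vectors)."""
    points = set()
    for coeffs in product((0, 1), repeat=len(basis)):
        v = shift
        for c, b in zip(coeffs, basis):
            if c:
                v = xor(v, b)
        points.add(v)
    return points

def is_affine(points):
    """Nonempty and closed under a + b + c, which characterizes affine subspaces over F_2."""
    return bool(points) and all(xor(xor(a, b), c) in points
                                for a in points for b in points for c in points)

def linear_map(rows, x):
    """Matrix-vector product over F_2; plays the role of E(., s) for one fixed seed s."""
    return tuple(sum(r & xi for r, xi in zip(row, x)) % 2 for row in rows)

SHIFT = (1, 0, 0, 0, 0)
BASIS = [(1, 1, 0, 0, 0), (0, 0, 1, 1, 0), (0, 0, 0, 1, 1)]
ROWS = [(1, 0, 1, 0, 1), (0, 1, 1, 0, 0)]

support = affine_span(SHIFT, BASIS)
fibers = {}                              # output y -> support of (X | E(X, s) = y)
for x in support:
    fibers.setdefault(linear_map(ROWS, x), set()).add(x)
closed = all(is_affine(fiber) for fiber in fibers.values())
```

Since a uniform distribution conditioned on an event is uniform on the corresponding subset, checking that every fiber is an affine subspace is enough to see that the conditioned source is again affine in this example.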

The explicit affine extractor that we will use for this presentation is the following result of 
Bourgain [5]: 

Theorem 13. [5] For every constant $0 < \delta \le 1$, there is an explicit affine extractor $\mathrm{AExt}\colon \mathbb{F}_2^n \to \mathbb{F}_2^m$
for min-entropy $\delta n$ with output length $m = \Omega(n)$ and error at most $2^{-\Omega(m)}$. □

Now, having these tools in hand, we are ready to describe our construction of invertible affine 
extractors with nearly optimal output length. 
Theorem 14. For every constant $\delta \in (0, 1]$ and every $\alpha \in (0, 1)$, there is an explicit invertible
affine extractor $D\colon \mathbb{F}_2^n \to \mathbb{F}_2^m$ for min-entropy $\delta n$ with output length $m = \delta n - O(n^\alpha)$ and error at
most $O(2^{-n^{\alpha/3}})$.

Proof. (Sketch) Let $\epsilon := 2^{-n^{\alpha/3}}$, and $t := O(\log^3(n/\epsilon)) = O(n^\alpha)$ be the seed length required by
the extractor Ext in Theorem 10 for input length $n$ and error $\epsilon$, and further, let $n' := n - t$. Set
up Ext for input length $n'$, min-entropy $\delta n - t$, seed length $t$ and error $\epsilon$. Also set up Bourgain's
extractor AExt for input length $n'$ and entropy rate $\delta'$, for an arbitrary constant $\delta' < \delta$. Then the
function $D$ will view the $n$-bit input sequence as a tuple $(s, x)$, $s \in \mathbb{F}_2^t$ and $x \in \mathbb{F}_2^{n'}$, and outputs
$\mathrm{Ext}(x, s + \mathrm{AExt}(x)|_{[t]})$.

First we show that this is an affine extractor. Suppose that $(S, X) \in \mathbb{F}_2^t \times \mathbb{F}_2^{n'}$ is a random
variable sampled from an affine distribution with min-entropy $\delta n$. The variable $S$ can have an
affine dependency on $X$. Hence, for every fixed $s \in \mathbb{F}_2^t$, the distribution of $X$ conditioned on the
event $S = s$ is affine with min-entropy at least $\delta n - t$, which is at least $\delta' n'$ for large enough $n$. Hence
$\mathrm{AExt}(X)$ will be $2^{-\Omega(n)}$-close to uniform by Theorem 13. This implies that $\mathrm{AExt}(X)|_{[t]} + S$ can
extract $t$ random bits from the affine source with error $2^{-\Omega(n)}$. Combining this with Theorem 12, and
noticing the fact that the class of affine sources is closed with respect to linear seeded extractors,
we conclude that $D$ is an affine extractor with error at most $\epsilon + 2^{-\Omega(n)} \cdot 2^{t+3} = O(2^{-n^{\alpha/3}})$.

Now the inverter works as follows: Given $y \in \mathbb{F}_2^m$, first it picks $Z \in \mathbb{F}_2^t$ uniformly at random.
The seeded extractor Ext, given the seed Z, is a linear function $\mathsf{Ext}_Z\colon \mathbb{F}_2^{n'} \to \mathbb{F}_2^m$. Without loss
of generality, assume that this function is surjective.$^{6}$ Then the inverter picks $X \in \mathbb{F}_2^{n'}$ uniformly
at random from the affine subspace defined by the linear constraint $\mathsf{Ext}_Z(X) = y$, and outputs
$(Z + \mathsf{AExt}(X)|_{[t]}, X)$. It is easy to verify that the output is indeed a valid preimage of y. To see
the uniformity of the inverter, note that if y is chosen uniformly at random, the distribution of
$(Z, X)$ will be uniform on $\mathbb{F}_2^n$. Hence $(Z + \mathsf{AExt}(X)|_{[t]}, X)$, which is the output of the inverter, will
be uniform. □
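The central step of the inverter, sampling uniformly from the affine subspace cut out by a linear constraint over $\mathbb{F}_2$, can be sketched as follows. This is only an illustration under stated assumptions: the fixed-seed map is represented by a hypothetical list of bit-mask rows `rows` (a stand-in for $\mathsf{Ext}_Z$, not Trevisan's extractor), vectors are Python integers, and all function names are ours.

```python
import random

def solve_gf2(rows, rhs, n):
    """Solve M x = rhs over GF(2); rows[i] is the n-bit mask of row i of M.

    Returns (particular_solution, kernel_basis), or None if inconsistent.
    """
    mask = (1 << n) - 1
    pivots = {}  # pivot column -> fully reduced augmented row (rhs at bit n)
    for r, b in zip(rows, rhs):
        row = (r & mask) | (b << n)
        for col, prow in pivots.items():
            if (row >> col) & 1:
                row ^= prow
        if row & mask == 0:
            if row >> n:
                return None          # the equation 0 = 1 appeared
            continue
        col = (row & mask).bit_length() - 1
        for c in pivots:             # keep older pivot rows reduced
            if (pivots[c] >> col) & 1:
                pivots[c] ^= row
        pivots[col] = row
    particular = 0                   # free variables set to 0
    for col, prow in pivots.items():
        if prow >> n:
            particular |= 1 << col
    kernel = []                      # one basis vector per free column
    for f in range(n):
        if f in pivots:
            continue
        vec = 1 << f
        for col, prow in pivots.items():
            if (prow >> f) & 1:
                vec |= 1 << col
        kernel.append(vec)
    return particular, kernel

def apply_gf2(rows, x):
    """Compute M x over GF(2) as a list of bits."""
    return [bin(r & x).count("1") & 1 for r in rows]

def sample_preimage(rows, y, n, rng):
    """Uniform sample from {x : M x = y}: particular solution plus random kernel element."""
    sol = solve_gf2(rows, y, n)
    if sol is None:
        raise ValueError("y is not in the image of the map")
    x, kernel = sol
    for vec in kernel:
        if rng.getrandbits(1):
            x ^= vec
    return x
```

Adding an independent uniform combination of a kernel basis to any particular solution gives a uniform element of the solution set, which is the property the uniformity argument above relies on.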

Remark 15. In the above construction we are using Bourgain's extractor as a black box and hence,
it can be replaced by an arbitrary affine extractor working for an arbitrary field size (however,
depending on the particular affine extractor being used, we may need to adapt the parameters
given above accordingly). In particular, over large fields one can use the affine extractor given by
Gabizon and Raz [19] that works for sub-constant entropy rates as well.$^{7}$

Remark 16. Bourgain presented his affine extractor for the most challenging underlying field, 
namely, the binary field. However, his argument can be adapted to work for larger fields as well [6]. 
Hence, one can obtain invertible affine extractors for arbitrary finite fields. 

An affine extractor is, in particular, a symbol-fixing extractor. Hence the above theorem,
combined with Lemma 6, gives us a wiretap protocol with almost optimal parameters:

Theorem 17. Let $\delta \in [0,1)$ and $\alpha \in (0,1/3)$ be constants. Then for a prime power q > 1 and
every large enough n there is a $(\delta n, O(2^{-n^{\alpha}}), 0)_q$-resilient wiretap protocol with block length n and
rate $1 - \delta - o(1)$. □



6 Because the seeded extractor is strong and linear, for most choices of the seed it is a good extractor, and hence 
necessarily surjective. Hence if Ext is not surjective for some seed z, one can replace it by a trivial surjective linear 
mapping without affecting its extraction properties. 

7 Using this extractor, we can achieve the best parameters by combining it with the seeded extractor for affine 
extractors over large fields that is constructed in the same paper. 
5 Further Applications



In this section we will sketch some important applications of our technique to more general wiretap 
problems. 



5.1 Noisy Channels and Active Intruders 

Suppose that Alice wants to transmit a particular sequence to Bob through a noisy channel. She 
can use various techniques from coding theory to encode her information and protect it against 
noise. Now what if there is an intruder who can partially observe the transmitted sequence and 
even manipulate it? Modification of the sequence by the intruder can be regarded in the same way 
as the channel noise; thus one gets security against active intrusion as a bonus by constructing a 
code that is resilient against noise and passive eavesdropping. There are two natural and modular 
approaches to construct such a code. A possible attempt would be to first encode the message
using a good error-correcting code and then apply a wiretap encoder to protect the encoded
sequence against the wiretapper. However, this will not necessarily keep the information protected
against the channel noise, as the combination of the wiretap encoder and decoder does not have to
be resistant to noise. Another attempt is to first use a wiretap encoder and then apply an
error-correcting code on the resulting sequence. Here it is not necessarily the case that the information
will be kept secure against intrusion anymore, as the wiretapper now gets to observe the bits from
the channel-encoded sequence that may reveal information about the original sequence. However,
the wiretap protocol given in Theorem 17 is constructed from an invertible affine extractor, and
guarantees resiliency even if the intruder is allowed to observe arbitrary linear combinations of 
the transmitted sequence. Hence, we can use the second approach with our protocol and still 
ensure privacy against an active intruder, provided that the error-correcting code is linear. This 
immediately gives us the following result: 

Theorem 18. Suppose that there is a q-ary linear error-correcting code with rate r that is able to
correct up to a $\tau$ fraction of errors (via unique or list decoding). Then for every constant $\delta \in [0,1)$
and $\alpha \in (0,1/3)$ and large enough n, there is a $(\delta n, O(2^{-n^{\alpha}}), 0)_q$-resilient wiretap protocol with
block length n and rate $r - \delta - o(1)$ that can also correct up to a $\tau$ fraction of errors. □



The same idea can be used to protect fountain codes, e.g., LT codes [32] and Raptor codes,
against wiretappers without affecting the error correction capabilities of the code. Obviously this
simple composition idea can be used for any type of channel so long as the inner code is linear, at
the cost of reducing the total rate by almost $\delta$. Hence, if the inner code achieves the capacity of
the direct channel (in the absence of the wiretapper), the composed code will achieve the capacity
of the wiretapped channel, which is less than the original capacity by $\delta$ [10].



5.2 Network Coding 

Our wiretap protocol from invertible affine extractors is also applicable in the more general setting
of transmission over networks. A communication network can be modeled as a directed graph, in
which nodes represent the network devices and information is transmitted along the edges. One
particular node is identified as the source and m nodes are identified as receivers. The main problem
in network coding is to have the source reliably transmit information to the receivers at the highest
possible rate, while allowing the intermediate nodes to arbitrarily process the information along the
way. Suppose that the min-cut from the source to each receiver is n. It was shown in [1] that the
source can transmit information up to rate n to all receivers, and in [28, 25] that linear network
coding is in fact sufficient to achieve this rate (see [51] for a comprehensive account of these and
other relevant results).

Designing wiretap protocols for networks is an important question in network coding, which was
first posed by Cai and Yeung [7]. In this problem, an intruder can choose a bounded number, say
t, of the edges and eavesdrop on all the packets going through those edges. They designed a network
code that could provide the optimal multicast rate of $n - t$ with perfect privacy. However, this
code requires an alphabet size of order $\binom{|E|}{t}$, where E is the set of edges. Their result was later
improved in [16], where it is shown that a random linear coding scheme can provide privacy with a much
smaller alphabet size if one is willing to achieve a slightly sub-optimal rate. Namely, they obtain
rate $n - t(1 + \epsilon)$ with an alphabet of size roughly $O(|E|^{1/\epsilon})$, and show that achieving the exact
optimal rate is not possible with small alphabet size. El Rouayheb and Soljanin [15] suggested
using the original code of Ozarow and Wyner [36] as an outer code at the source and showed
that a careful choice of the network code can provide optimal rate with perfect privacy. However,
their code eventually needs an alphabet of size at least $\binom{|E|-1}{t-1} + m$. Building upon this work,
Silva and Kschischang [26] constructed an outer code that provides similar results while leaving
the underlying network code unchanged. However, their protocol imposes the assumption that the
packets in the network are transmitted in bundles of length at least n (hence, one needs to have
an estimate of the min-cut size of the network), which restricts the intruder to observing the same set
of links within each bundle, practically enlarging the field size from q to at least $q^n$.



Using the protocol given in Theorem 17 as an outer code in the source node, one can construct
an asymptotically optimal wiretap protocol for networks that is completely unaware of the network
and eliminates all the restrictions in the above results. Hence, extending our notion of
$(t, \epsilon, \gamma)_q$-resilient wiretap protocols naturally to communication networks, we obtain the following:

Theorem 19. Let $\delta \in [0,1)$ and $\alpha \in (0,1/3)$ be constants, and consider a network that uses a
linear coding scheme over a finite field $\mathbb{F}_q$ for reliably transmitting information at rate n, for n
large enough.$^{8}$ Then the source and the receiver nodes can use an outer code of rate $1 - \delta - o(1)$
which is completely independent of the network, leaves the network code unchanged, and provides
resilience $\delta$ with error $O(2^{-n^{\alpha}})$ and zero leakage over a q-ary alphabet. □



5.3 Arbitrary Processing 

In this section we consider the erasure wiretap problem in its most general setting, which is still 
of practical importance. Suppose that the information emitted by the source goes through an 
arbitrary communication medium and is arbitrarily processed on the way to provide protection 
against noise, to obtain better throughput, or for other reasons. Now consider an intruder who 
is able to eavesdrop a bounded amount of information at various points of the channel. One can 
model this scenario in the same way as the original point-to-point wiretap channel problem, with 
the difference that instead of observing t arbitrarily chosen bits, the intruder now gets to choose 
an arbitrary circuit C with t output bits (which captures the accumulation of all the intermediate 

8 We need n large enough depending on the asymptotics of the affine extractor. However, if n is not large enough 
one can use the wiretap outer code on bundles of rn packets at a time, for a sufficiently large r. This will not impose 
any restriction on the links observed by the intruder and is not equal to enlarging the alphabet size. 






processing) and observes the output of the circuit when applied to the transmitted sequence.$^{9}$
Obviously there is no way to guarantee resiliency in this setting, since the intruder can simply
choose C to compute t output bits of the wiretap decoder. However, suppose that in addition there
is an auxiliary communication channel between the source and the receiver (that we call the side
channel) that is separated from the main channel, and hence, the information passed through the
two channels does not blend together through intermediate processing.

We call this scenario the general wiretap problem, and extend our notion of $(t, \epsilon, \gamma)$-resilient
protocol to this problem, with the slight modification that now the output of the encoder (and the
input of the decoder) is a pair of strings $(y_1, y_2) \in \mathbb{F}_2^n \times \mathbb{F}_2^d$, where $y_1$ ($y_2$) is sent through the main
(side) channel. Now we call $n + d$ the block length and let the intruder choose an arbitrary pair of
circuits $(C_1, C_2)$, one for each channel, that output a total of t bits and observe $(C_1(y_1), C_2(y_2))$.

The information-theoretic upper bounds for the achievable rates in the original wiretap problem 
obviously extend to the general wiretap problem as well. Below we show that for the general prob- 
lem, secure transmission is possible at asymptotically optimal rates even if the intruder intercepts 
the entire communication passing through the side channel. 

As before, our idea is to use invertible extractors to construct general wiretap protocols,
but this time we use invertible strong seeded extractors. Strong seeded extractors were used in [8]
to construct ERFs, and this is exactly what we use as the decoder in our protocol. As the encoder
we will use the corresponding inverter, which outputs a pair of strings, one for the extractor's input
which is sent through the main channel and another as the seed which is sent through the side
channel. Hence we will obtain the following result, which is proved in Appendix B.6.



Theorem 20. Let $\delta \in [0,1)$ be a constant. Then for every $\alpha, \epsilon > 0$, there is a $(\delta n, \epsilon, 2^{-\alpha n} + \epsilon)$-resilient
wiretap protocol for the general wiretap channel problem that needs to send n bits through
the main channel and $d := O(\log^3(n/\epsilon))$ bits through the side channel and achieves rate
$1 - \delta - \alpha - O(d/(n + d))$. The protocol is secure even when the entire communication through the side channel
is observable by the intruder. □



The above theorem uses the linear seeded extractor of Theorem 10 to achieve almost perfect
resiliency and approaches the optimal rate using only $O(\log^3 n)$ bits of side communication. It is
possible to drop this amount down to $O(\log n)$ bits by inverting existing constructions of seeded
extractors that work with seeds of length $O(\log(n/\epsilon))$ and extract almost the whole min-entropy
(e.g., [30, E21 E2]). However, we defer this improvement to subsequent work. 

We observe that it is not possible to guarantee zero leakage for the general wiretap problem
above. Specifically, suppose that $(C_1, C_2)$ are chosen in a way that they have a single preimage for a
particular output $(w_1, w_2)$. With nonzero probability the observation of the intruder may turn out
to be $(w_1, w_2)$, in which case the entire message is revealed. Nevertheless, it is possible to guarantee
negligible leakage as the above theorem does. Moreover, when the general protocol above is used
for the original wiretap II problem (where there is no intermediate processing involved), there is
no need for a separate side channel and the entire encoding can be transmitted through a single
channel. Contrary to Theorem 17, however, the general protocol will not guarantee zero leakage
even for this special case.



9 In fact this models a harder problem, as in our problem the circuit C is given by the communication scheme and
not the intruder. Nevertheless, we solve the harder problem.
A Preliminaries and Basic Facts

For a prime power q, we use $\mathbb{F}_q$ to denote the finite field with q elements. We will occasionally use
the notation $\mathbb{F}_2$ for the set {0, 1}, even if we do not need to use the field structure. For a positive
integer n, define [n] as the set {1, 2, ..., n}. For a vector $x = (x_1, x_2, \ldots, x_n)$ and a subset $S \subseteq [n]$,
we denote by $x|_S$ the vector of length |S| that is obtained from x by removing all the coordinates
$x_i$, $i \notin S$. For an integer k > 0, we will use the notation $\mathcal{U}_k$ for the uniform distribution on $\mathbb{F}_2^k$.
More generally, for a finite set $\Omega$, we will use $\mathcal{U}_\Omega$ for the uniform distribution on $\Omega$. For a function
f, we denote by $f^{-1}(x)$ the set of the preimages of x, i.e., the set $\{y : f(y) = x\}$.

We denote the probability measure defined by a distribution $\mathcal{X}$ by $\Pr_{\mathcal{X}}$, hence, $\Pr_{\mathcal{X}}(x)$ and
$\Pr_{\mathcal{X}}[S]$ for $x \in \Omega$ and $S \subseteq \Omega$ denote the probability that $\mathcal{X}$ assigns to an outcome x and an event
S, respectively. We will use $\mathcal{X}|S$ to denote the conditional distribution of $\mathcal{X}$ restricted to the set
(event) S, and $X \sim \mathcal{X}$ to denote that a random variable X is distributed according to $\mathcal{X}$.

Definition 21. The support of a distribution is the set of all the elements of the sample space to
which it assigns nonzero probabilities. The min-entropy of a distribution $\mathcal{X}$ with finite support S
is defined as

$$H_\infty(\mathcal{X}) := \min_{x \in S} \{-\log \Pr_{\mathcal{X}}(x)\},$$

where typically $\log(\cdot)$ is the logarithm function in base 2. However, when $\mathcal{X}$ is supported on a set
of d-ary strings, we find it more convenient to use the logarithm function in base d and measure the
entropy in d-ary symbols instead of bits. The Shannon entropy of the distribution, on the other
hand, is defined as

$$H(\mathcal{X}) := \sum_{x \in S} -\Pr_{\mathcal{X}}(x) \log \Pr_{\mathcal{X}}(x).$$



When a distribution defined on the set of n-bit strings has min-entropy k, the quantity k/n 
defines the entropy rate of the distribution. Note that the above definition immediately implies 
that the min-entropy of a distribution is upper bounded by its Shannon entropy (which is in fact 
the expectation of the logarithm of the probabilities). Hence, if the min-entropy of a distribution is 
at least h, its Shannon-entropy is also at least h. These two measures however coincide for uniform 
distributions. 

Definition 22. The statistical distance (or total variation distance) of two distributions $\mathcal{X}$ and $\mathcal{Y}$
defined on the same finite space S is given by

$$\frac{1}{2} \sum_{s \in S} |\Pr_{\mathcal{X}}(s) - \Pr_{\mathcal{Y}}(s)|,$$

and is denoted by $\mathrm{dist}(\mathcal{X}, \mathcal{Y})$. Note that this is half the $\ell_1$ distance of the two distributions when
regarded as vectors of probabilities over S.

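It can be shown that the statistical distance of the two distributions is at most $\epsilon$ if and only if
for every $T \subseteq S$, we have $|\Pr_{\mathcal{X}}[T] - \Pr_{\mathcal{Y}}[T]| \le \epsilon$. When the statistical distance of $\mathcal{X}$ and $\mathcal{Y}$ is at
most $\epsilon$, we denote it by $\mathcal{X} \sim_{\epsilon} \mathcal{Y}$.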

While we defined the above terms for probability distributions, with a slight abuse of notation 
we may use them interchangeably for random variables as well. 
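The three quantities defined above can be computed directly for small distributions. The following is a minimal sketch in which a distribution is assumed to be given as a Python dictionary from outcomes to probabilities; the function names are ours and purely illustrative.

```python
import math

def min_entropy(p, base=2):
    """Min-entropy: the smallest value of -log Pr(x) over the support."""
    return min(-math.log(v) / math.log(base) for v in p.values() if v > 0)

def shannon_entropy(p, base=2):
    """Shannon entropy: the expectation of -log Pr(x)."""
    return sum(-v * math.log(v) / math.log(base) for v in p.values() if v > 0)

def statistical_distance(p, q):
    """Half the l1 distance between two distributions on the same space."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)

# A small example on 2-bit strings.
p = {"00": 0.5, "01": 0.25, "10": 0.25}
uniform = {s: 0.25 for s in ("00", "01", "10", "11")}
```

For this example the min-entropy is 1 bit (the heaviest outcome has probability 1/2), the Shannon entropy is 1.5 bits, and the distance from uniform is 0.25, in line with the remark that min-entropy never exceeds Shannon entropy.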

The following proposition quantifies the Shannon entropy of a distribution that is close to 
uniform: 

Proposition 23. Let $\mathcal{X}$ be a probability distribution on a finite set S, $|S| > 4$, that is $\epsilon$-close to
the uniform distribution on S, for some $\epsilon \le 1/4$. Then $H(\mathcal{X}) \ge (1 - \epsilon) \log |S|$.

Proof. Let $n := |S|$, and let $f(x) := -x \log x$. The function f(x) is concave, passes through
the origin and is strictly increasing in the range [0, 1/e]. From the definition, we have
$H(\mathcal{X}) = \sum_{s \in S} f(\Pr_{\mathcal{X}}(s))$. For each term s in this summation, the probability that $\mathcal{X}$ assigns to s is either
at least 1/n, which makes the corresponding term at least $\log n / n$ (due to the particular range of
|S| and $\epsilon$), or is equal to $1/n - \epsilon_s$, for some $\epsilon_s > 0$, in which case the term corresponding to s is
less than $\log n / n$ by at most $\epsilon_s \log n$ (this follows by observing that the slope of the line connecting
the origin to the point $(1/n, f(1/n))$ is $\log n$). The bound on the statistical distance implies that
the differences $\epsilon_s$ add up to at most $\epsilon$. Hence, the Shannon entropy of $\mathcal{X}$ can be less than $\log n$ by
at most $\epsilon \log n$. □
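As a quick numerical sanity check of Proposition 23 (not a substitute for the proof), one can compare the Shannon entropy of a perturbed uniform distribution with the bound $(1 - \epsilon)\log|S|$. The sketch below assumes a distribution given as a list of probabilities; the helper names and the example distribution are ours.

```python
import math

def entropy_bits(p):
    """Shannon entropy in bits of a list of probabilities."""
    return -sum(v * math.log2(v) for v in p if v > 0)

def distance_to_uniform(p):
    """Statistical distance between p and the uniform distribution."""
    n = len(p)
    return 0.5 * sum(abs(v - 1.0 / n) for v in p)

def prop23_holds(p):
    """Check H(X) >= (1 - eps) * log|S|, where eps = dist(X, uniform)."""
    eps = distance_to_uniform(p)
    return entropy_bits(p) >= (1 - eps) * math.log2(len(p)) - 1e-12

# |S| = 8; move mass 0.2 onto one point and spread the deficit evenly.
skewed = [1 / 8 + 0.2] + [(1 - (1 / 8 + 0.2)) / 7] * 7
```

Here `skewed` is 0.2-close to uniform, so the proposition promises at least 0.8 · 3 = 2.4 bits of entropy.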

The following (easy to verify) proposition shows that any function maps close distributions to 
close distributions: 

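Proposition 24. Let $\Omega$ and $\Gamma$ be finite sets and f be a function from $\Omega$ to $\Gamma$. Suppose that $\mathcal{X}$ and
$\mathcal{Y}$ are probability distributions on $\Omega$ and $\Gamma$, respectively, and let $\mathcal{X}'$ be a probability distribution on
$\Omega$ which is $\delta$-close to $\mathcal{X}$. Then if $f(\mathcal{X}) \sim_{\epsilon} \mathcal{Y}$, then $f(\mathcal{X}') \sim_{\epsilon + \delta} \mathcal{Y}$. □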

We will use the following proposition in our construction of wiretap protocols from invertible
extractors in Section 2.

The q-ary entropy function $h_q$ (that is used in the introduction) is defined as

$$h_q(x) := x \log_q(q - 1) - x \log_q(x) - (1 - x) \log_q(1 - x).$$

A.1 Preliminaries on Extractors

Now we are ready to define the basic notions related to randomness extractors that we will use. 
We refer the reader to jlQ] for a more detailed account of these notions. 

Definition 25. Let $\Sigma$ be a finite set of size d > 1. An $(n, k)_d$ family of d-ary randomness sources
of length n and min-entropy k is a set $\mathcal{F}$ of probability distributions on $\Sigma^n$ such that every $\mathcal{X} \in \mathcal{F}$
has d-ary min-entropy at least k.

There are numerous natural families of sources that have been introduced and studied in the 
theory of randomness extractors. In this work, besides the general family of distributions with high 
min-entropy, we will focus on the family of symbol-fixing and affine sources, defined below. 

Definition 26. An $(n, k)_d$ symbol-fixing source is the distribution of a random variable
$X = (X_1, X_2, \ldots, X_n) \in \Sigma^n$, for some set $\Sigma$ of size d, in which at least k of the coordinates (chosen
arbitrarily) are uniformly and independently distributed on $\Sigma$ and the rest take deterministic values.

When d = 2, we will have a binary symbol-fixing source, or simply a bit-fixing source. In this
case $\Sigma = \{0, 1\}$, and the subscript d is dropped from the notation.

Definition 27. For a prime power q, the family of q-ary k-dimensional affine sources of length
n is the set of distributions on $\mathbb{F}_q^n$, each uniformly distributed on an affine translation of some
k-dimensional subspace of $\mathbb{F}_q^n$.

Affine sources are natural generalizations of symbol-fixing sources when the alphabet size is a
prime power. It is easy to see that the q-ary min-entropy of a k-dimensional affine source is k.
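To make the relation between the two source families concrete, the following sketch enumerates the support of a binary affine source and shows how a bit-fixing source arises as the special case where the basis consists of unit vectors. Vectors of $\mathbb{F}_2^n$ are represented as Python integers (bit masks); the function names and this encoding are assumptions made for illustration only.

```python
from itertools import product

def affine_source_support(basis, shift):
    """Support of the uniform distribution on shift + span(basis) over F_2."""
    support = set()
    for coeffs in product((0, 1), repeat=len(basis)):
        v = shift
        for c, b in zip(coeffs, basis):
            if c:
                v ^= b
        support.add(v)
    return support

def bit_fixing_as_affine(n, free_positions, fixed_value):
    """Express a bit-fixing source on n bits as (basis, shift).

    Bits at free_positions are uniform; all other bits are copied from fixed_value.
    """
    basis = [1 << i for i in free_positions]
    shift = fixed_value & ~sum(basis) & ((1 << n) - 1)
    return basis, shift
```

When the basis vectors are linearly independent, the support has exactly $2^k$ elements for a basis of size k, so the uniform distribution on it has min-entropy k, matching the remark above.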

Definition 28. A function $E\colon \mathbb{F}_2^n \times \mathbb{F}_2^d \to \mathbb{F}_2^m$ is a strong seeded $(k, \epsilon)$-extractor if for every
distribution $\mathcal{X}$ on $\mathbb{F}_2^n$ with min-entropy at least k, random variable $X \sim \mathcal{X}$ and a seed $Y \sim \mathcal{U}_d$,
the distribution of $(E(X, Y), Y)$ is $\epsilon$-close to $\mathcal{U}_{m+d}$. An extractor is explicit if it is polynomial-time
computable.

A strong extractor E(x, y) for a source $\mathcal{X}$ with error $\epsilon$ satisfies the property that, by an averaging
argument, for all but a $\sqrt{\epsilon}$ fraction of the choices of the seed y, the distribution of $E(\mathcal{X}, y)$ is $\sqrt{\epsilon}$-close to uniform.

For more restricted sources (in particular, symbol-fixing and affine sources), seedless (or deter- 
ministic) extraction is possible. 

Definition 29. Let $\Sigma$ be a finite alphabet of size d > 1. A function $E\colon \Sigma^n \to \Sigma^m$ is a (seedless)
$(k, \epsilon)_d$-extractor for a family $\mathcal{F}$ of $(n, k)_d$ sources (defined on $\Sigma^n$) if for every distribution $\mathcal{X} \in \mathcal{F}$
with min-entropy at least k, the distribution $E(\mathcal{X})$ is $\epsilon$-close to $\mathcal{U}_{\Sigma^m}$. A seedless extractor is explicit
if it is polynomial-time constructible.

A.2 Preliminaries on Expander Graphs

For the wiretap protocol given in Section 3 we need essentially the same tools used for the
symbol-fixing extractor construction of [23], that we briefly review here.

We will always work with directed regular expander graphs that are obtained from undirected 
graphs by replacing each undirected edge with two directed edges in opposite directions. Let 
G = (V, E) be a d-regular graph. Then a labeling of the edges of G is a function $L\colon V \times [d] \to V$
such that for every $u \in V$ and $t \in [d]$, $(u, L(u, t))$ is in E. The labeling is consistent if whenever
$L(u, t) = L(v, t)$, then u = v. Note that the natural labeling of a Cayley graph is in fact consistent.
We call a family of expander graphs constructible if all the graphs in the family have a consistent 
labeling that is efficiently computable. 

Let A denote the normalized adjacency matrix of a d-regular graph G. We denote by $\lambda_G$ the
second largest eigenvalue of A in absolute value. The spectral gap of G is given by $1 - \lambda_G$. Starting
from a probability distribution p on the set of vertices, represented as a real vector, performing
a single-step random walk on G leads to the distribution defined by pA. The following is a well
known lemma on the convergence of the distributions resulting from random walks (cf. [29] for a
proof):

Lemma 30. Let G = (V, E) be a d-regular undirected graph, and A be its normalized adjacency
matrix. Then for any probability vector p, we have $\|pA - \mathcal{U}_V\|_2 \le \lambda_G \|p - \mathcal{U}_V\|_2$, where $\|\cdot\|_2$ denotes
the $\ell_2$ norm. □
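The contraction stated in Lemma 30 can be observed numerically on a small example. The sketch below uses the cycle on an odd number of vertices as a toy 2-regular graph (chosen by us only because its eigenvalues, $\cos(2\pi k/n)$, are known in closed form; it is not an expander family used in the paper), and the function names are illustrative.

```python
import math

def walk_step(p):
    """One step of the random walk on the n-cycle, i.e., the map p -> pA."""
    n = len(p)
    return [0.5 * (p[(i - 1) % n] + p[(i + 1) % n]) for i in range(n)]

def l2_to_uniform(p):
    """l2 distance between p and the uniform probability vector."""
    n = len(p)
    return math.sqrt(sum((v - 1.0 / n) ** 2 for v in p))

def cycle_lambda(n):
    """Second largest eigenvalue, in absolute value, of the normalized n-cycle."""
    return max(abs(math.cos(2 * math.pi * k / n)) for k in range(1, n))

def contraction_holds(p, steps):
    """Check ||pA - U||_2 <= lambda_G * ||p - U||_2 along a walk of given length."""
    lam = cycle_lambda(len(p))
    for _ in range(steps):
        q = walk_step(p)
        if l2_to_uniform(q) > lam * l2_to_uniform(p) + 1e-12:
            return False
        p = q
    return True
```

Starting from a point mass on the 5-cycle, each step shrinks the $\ell_2$ distance to uniform by a factor of at least $\lambda_G = \cos(\pi/5)$.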

B Proofs 

B.1 Proof that Definition 2 Solves Wiretap II Problem

Suppose that (E, D) is an encoder/decoder pair as we modeled in Definition 2. Let $W := Y|_S$ be
the intruder's observation, and denote by $W'$ the set of good observations, namely,

$$W' := \{w \in \Sigma^{|S|} : \mathrm{dist}(X|(W = w), \mathcal{U}_{\Sigma^m}) \le \epsilon\}.$$

Let $H(\cdot)$ denote the Shannon entropy in d-ary symbols. Then we will have

$$H(X|W) = \sum_{w \in \Sigma^{|S|}} \Pr(W = w) H(X|W = w) \ge \sum_{w \in W'} \Pr(W = w) H(X|W = w)$$
$$\overset{(a)}{\ge} \sum_{w \in W'} \Pr(W = w)(1 - \epsilon)m \overset{(b)}{\ge} (1 - \gamma)(1 - \epsilon)m \ge (1 - \gamma - \epsilon)m.$$

The inequality (a) follows from the definition of $W'$ combined with Proposition 23 and (b) by the
definition of leakage parameter.



B.2 Proof of Lemma 3

In this appendix we show that, up to a loss in error parameters, the notion of wiretap protocols
in Definition 2 is equivalent to the notion of average-case AONTs. This easily follows from the
following proposition:

Proposition 31. Let (X, Y) be a pair of random variables jointly distributed on a finite set $\Omega \times \Gamma$.
Then$^{10}$ $\mathbb{E}_Y[\mathrm{dist}(X|Y, X)] = \mathbb{E}_X[\mathrm{dist}(Y|X, Y)]$.



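Proof. For $x \in \Omega$ and $y \in \Gamma$, we will use shorthands $p_x, p_y, p_{xy}$ to denote $\Pr[X = x]$, $\Pr[Y = y]$,
$\Pr[X = x, Y = y]$, respectively. Then we have

$$\mathbb{E}_Y[\mathrm{dist}(X|Y, X)] = \sum_{y \in \Gamma} p_y \, \mathrm{dist}(X|(Y = y), X) = \frac{1}{2} \sum_{y \in \Gamma} p_y \sum_{x \in \Omega} |p_{xy}/p_y - p_x|$$
$$= \frac{1}{2} \sum_{y \in \Gamma} \sum_{x \in \Omega} |p_{xy} - p_x p_y| = \frac{1}{2} \sum_{x \in \Omega} p_x \sum_{y \in \Gamma} |p_{xy}/p_x - p_y|$$
$$= \sum_{x \in \Omega} p_x \, \mathrm{dist}(Y|(X = x), Y) = \mathbb{E}_X[\mathrm{dist}(Y|X, Y)].$$

□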
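The symmetry in Proposition 31 is easy to confirm numerically on a small joint distribution. In the sketch below the joint distribution is assumed to be given as a matrix `joint[x][y]` of probabilities; the function names and the random test distribution are ours.

```python
import random

def expected_conditional_distance(joint):
    """E_Y[dist(X|Y, X)] for joint[x][y] = Pr[X = x, Y = y]."""
    nx, ny = len(joint), len(joint[0])
    px = [sum(joint[x]) for x in range(nx)]
    py = [sum(joint[x][y] for x in range(nx)) for y in range(ny)]
    total = 0.0
    for y in range(ny):
        if py[y] == 0:
            continue  # outcomes of probability zero contribute nothing
        dist = 0.5 * sum(abs(joint[x][y] / py[y] - px[x]) for x in range(nx))
        total += py[y] * dist
    return total

def transpose(joint):
    """Swap the roles of X and Y."""
    return [list(col) for col in zip(*joint)]

def random_joint(nx, ny, rng):
    """A random joint distribution on an nx-by-ny grid."""
    raw = [[rng.random() for _ in range(ny)] for _ in range(nx)]
    s = sum(sum(row) for row in raw)
    return [[v / s for v in row] for row in raw]
```

Evaluating the function on a joint distribution and on its transpose gives the two sides of the identity.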

Now, consider a $(t, \epsilon, \gamma)_q$-resilient wiretap protocol as in Definition 2, and accordingly, let the
random variable Y = E(X, R) denote the encoding of X with a random seed R. For a set $S \subseteq [n]$ of
size at most t, denote by $W := Y|_S$ the intruder's observation. The resiliency condition implies that
the set of bad observations $B_S$ has a probability mass of at most $\gamma$ and hence, the expected distance
$\mathrm{dist}(X|W, X)$ taken over the distribution of W is at most $\epsilon + \gamma$. Now we can apply Proposition 31
above to the jointly distributed pair of random variables (W, X), and conclude that the expected
distance $\mathrm{dist}(W|X, W)$ over the distribution of X (which is uniform) is at most $\epsilon + \gamma$. This implies
that the encoder is an average-case t-AONT with error at most $2(\epsilon + \gamma)$. Conversely, the same
argument combined with Markov's bound shows that an average-case t-AONT with error $\eta^2$ can
be seen as a $(t, \eta, \eta)$-resilient wiretap protocol.

B.3 Proof of Lemma 6

Let E and D denote the wiretap encoder and decoder, respectively. Hence, E is the $(\gamma^2/2)$-inverter
for f, and D is the extractor f itself. From the definition of the inverter, for every $x \in \Sigma^m$ and
every random seed r, we have D(E(x, r)) = x. Hence it is sufficient to show that the pair satisfies
the resiliency condition. We will use the following proposition in our proof:



10 Here we are abusing the notation and denote by Y the marginal distribution of the random variable Y, and by
$Y|(X = a)$ the distribution of the random variable Y conditioned on the event X = a.
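Proposition 32. Let $\Omega$ be a finite set that is partitioned into subsets $S_1, \ldots, S_k$ and suppose that $\mathcal{X}$
is a distribution on $\Omega$ that is $\gamma$-close to uniform. Denote by $p_i$, $i = 1, \ldots, k$, the probability assigned
to the event $S_i$ by $\mathcal{X}$. Then

$$\sum_{i \in [k]} p_i \cdot \mathrm{dist}(\mathcal{X}|S_i, \mathcal{U}_{S_i}) \le 2\gamma.$$

Proof. Let $N := |\Omega|$, and define for each i, $\gamma_i := \sum_{s \in S_i} |\Pr_{\mathcal{X}}(s) - 1/N|$, so that $\gamma_1 + \cdots + \gamma_k \le 2\gamma$.
Observe that by the triangle inequality, for every i we must have $|p_i - |S_i|/N| \le \gamma_i$. To conclude the
claim, it is enough to show that for every i, we have $\mathrm{dist}(\mathcal{X}|S_i, \mathcal{U}_{S_i}) \le \gamma_i/p_i$. This is shown in the
following.

$$p_i \cdot \mathrm{dist}(\mathcal{X}|S_i, \mathcal{U}_{S_i}) = \frac{p_i}{2} \sum_{s \in S_i} \left| \frac{\Pr_{\mathcal{X}}(s)}{p_i} - \frac{1}{|S_i|} \right| = \frac{1}{2} \sum_{s \in S_i} \left| \Pr_{\mathcal{X}}(s) - \frac{p_i}{|S_i|} \right|$$
$$\le \frac{1}{2} \sum_{s \in S_i} \left| \Pr_{\mathcal{X}}(s) - \frac{1}{N} \right| + \frac{1}{2} \sum_{s \in S_i} \left| \frac{1}{N} - \frac{p_i}{|S_i|} \right| \le \frac{\gamma_i}{2} + \frac{1}{2|S_i|} \cdot |S_i| \gamma_i = \gamma_i.$$

□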



Now, let the random variable X be uniformly distributed on $\Sigma^m$ and the seed $R \in \mathbb{F}_2^r$ be chosen
uniformly at random. Denote the encoding of X by Y := E(X, R). Fix any $S \subseteq [n]$ of size at most
$n - k$.

For every $w \in \Sigma^{|S|}$, let $Y_w$ denote the set $\{y \in \Sigma^n : y|_S = w\}$. Note that the sets $Y_w$ partition
the space $\Sigma^n$ into $|\Sigma|^{|S|}$ disjoint sets.

Let $\mathcal{Y}$ and $\mathcal{Y}_S$ denote the distribution of Y and $Y|_S$, respectively. The inverter guarantees that
$\mathcal{Y}$ is $(\gamma^2/2)$-close to uniform. Applying Proposition 32 we get that

$$\sum_{w \in \Sigma^{|S|}} \Pr[Y|_S = w] \cdot \mathrm{dist}(\mathcal{Y}|Y_w, \mathcal{U}_{Y_w}) \le \gamma^2.$$

The left hand side is the expectation of $\mathrm{dist}(\mathcal{Y}|Y_w, \mathcal{U}_{Y_w})$. Denote by W the set of all bad outcomes
of $Y|_S$, i.e., $W := \{w \in \Sigma^{|S|} \mid \mathrm{dist}(\mathcal{Y}|Y_w, \mathcal{U}_{Y_w}) > \gamma\}$. By Markov's inequality, we conclude that

$$\Pr[Y|_S \in W] \le \gamma.$$

For every $w \notin W$, the distribution of Y conditioned on the event $Y|_S = w$ is $\gamma$-close to a
symbol-fixing source with $n - |S| \ge k$ random symbols. The fact that D is a symbol-fixing extractor
for this entropy and Proposition 24 imply that, for any such w, the conditional distribution of
$D(Y)|(Y|_S = w)$ is $(\gamma + \epsilon)$-close to uniform. Hence with probability at least $1 - \gamma$ the distribution
of X conditioned on the outcome of $Y|_S$ is $(\gamma + \epsilon)$-close to uniform. This ensures the resiliency of
the protocol.
B.4 Omitted Details of the Proof of Theorem 7

In this section we show that SFExt, as defined in Theorem 7, is a symbol-fixing extractor. Let
$(x, w) \in [d]^m \times [d]^{n-m}$ be a vector sampled from an $(n, k)_d$ symbol-fixing source, and let
$u := \mathsf{SFExt}(x, w)$. Recall that u can be seen as the vertex of G reached at the end of the walk described
by w starting from x. Let $p_i$ denote the probability vector corresponding to the walk right after
the ith step, for $i = 0, \ldots, n - m$, and denote by $\mu$ the uniform probability vector on the vertices
of G. Our goal is to bound the error $\epsilon$ of the extractor, which is half the $\ell_1$ norm of $p_{n-m} - \mu$.

Suppose that x contains $k_1$ random symbols and the remaining $k_2 := k - k_1$ random symbols
are in w. Then $p_0$ has the value $d^{-k_1}$ at $d^{k_1}$ of the coordinates and zeros elsewhere, hence

$$\|p_0 - \mu\|_2^2 = d^{k_1}(d^{-k_1} - d^{-m})^2 + (d^m - d^{k_1}) d^{-2m} = d^{-k_1} - d^{-m} \le d^{-k_1}.$$

Now for each $i \in [n - m]$, if the ith step of the walk corresponds to a random symbol in w, the
$\ell_2$ distance is multiplied by at most $\lambda_G$ by Lemma 30. Otherwise the distance remains the same due to the
fact that the labeling of G is consistent. Hence we obtain $\|p_{n-m} - \mu\|_2^2 \le d^{-k_1} \lambda_G^{2 k_2}$. Translating
this into the $\ell_1$ norm by using the Cauchy-Schwarz inequality, we obtain $\epsilon$, namely,

$$\epsilon \le \frac{1}{2} d^{(m - k_1)/2} \lambda_G^{k_2} < 2^{((m - k_1) \log d + k_2 \log \lambda_G^2)/2}.$$

By our assumption, $\lambda_G \ge 1/\sqrt{d}$. Hence, everything but $k_1$ and $k_2$ being fixed, the above bound
is maximized when $k_1$ is minimized. When $k \le n - m$, this corresponds to the case $k_1 = 0$, and
otherwise to the case $k_1 = k - n + m$. This gives us the desired upper bound on $\epsilon$. □
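The bound $\epsilon \le \frac{1}{2} d^{(m - k_1)/2} \lambda_G^{k_2}$ can be checked by brute force on a toy instance. The sketch below is an illustration under our own assumptions, not the graph family used in the paper: the graph is the Cayley graph of $\mathbb{Z}_{16}$ with generators $\pm 1, \pm 2$ (so d = 4, m = 2, and the natural labeling is consistent), and all names are hypothetical.

```python
import math
from itertools import product

D, M = 4, 2                  # toy alphabet size d and output length m
N = D ** M                   # number of vertices, d^m
GENS = (1, -1, 2, -2)        # label t moves vertex u to (u + GENS[t]) mod N

def sfext(x, w):
    """Start at the vertex encoded by the base-d digits x and follow the walk w."""
    u = 0
    for digit in x:
        u = u * D + digit
    for t in w:
        u = (u + GENS[t]) % N
    return u

def lambda_g():
    """Second largest eigenvalue, in absolute value, of the normalized adjacency matrix."""
    def eig(k):
        return sum(math.cos(2 * math.pi * k * g / N) for g in GENS) / len(GENS)
    return max(abs(eig(k)) for k in range(1, N))

def extraction_error(template):
    """Exact distance from uniform of the output on a symbol-fixing source.

    template has length n; None marks a random symbol, an int a fixed one.
    """
    free = [i for i, s in enumerate(template) if s is None]
    counts = [0] * N
    for values in product(range(D), repeat=len(free)):
        sym = list(template)
        for i, v in zip(free, values):
            sym[i] = v
        counts[sfext(sym[:M], sym[M:])] += 1
    total = D ** len(free)
    return 0.5 * sum(abs(c / total - 1.0 / N) for c in counts)

def error_bound(k1, k2):
    """The bound (1/2) * d^((m - k1)/2) * lambda_G^k2 from the proof."""
    return 0.5 * D ** ((M - k1) / 2) * lambda_g() ** k2
```

For instance, the template `[None, 3, None, 0, None, None, 1, None]` has $k_1 = 1$ random symbol in the starting vertex and $k_2 = 4$ random steps in the walk, and its exact error can be compared against `error_bound(1, 4)`.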



B.5 Omitted Details of the Proof of Corollary 8

We prove Corollary 8 for the case c > 1. The construction is similar to the case c = 1, and in
particular the choice of m and k will remain the same. However, a subtle complication is that
the expander family may not have a graph with $d^m$ vertices and we need to adapt the extractor
of Theorem 7 to support our parameters, still with exponentially small error. To do so, we pick
a graph G in the family with N vertices, such that $c^{\eta m} d^m \le N \le c^{\eta m + 1} d^m$, for a small absolute
constant $\eta > 0$ that we are free to choose. The assumption on the expander family guarantees that
such a graph exists. Let $m'$ be the smallest integer such that $d^{m'} \ge c^{\eta m} N$. Index the vertices of
G by integers in [N]. Note that $m'$ will be larger than m by a constant multiplicative factor that
approaches 1 as $\eta \to 0$.

For positive integers q, $p \le q$, define the function $\mathsf{Mod}_{q,p}\colon [q] \to [p]$ by $\mathsf{Mod}_{q,p}(x) := 1 + (x \bmod p)$.
The extractor SFExt interprets the first $m'$ symbols of the input as an integer u, $0 \le u < d^{m'}$,
and performs a walk on G starting from the vertex $\mathsf{Mod}_{d^{m'},N}(u + 1)$, the walk being defined by
the remaining input symbols. If the walk reaches a vertex v at the end, the extractor outputs
$\mathsf{Mod}_{N,d^m}(v) - 1$, encoded as a d-ary string of length m. A similar argument as in Theorem 7 can
show that with our choice of the parameters, the extractor has an exponentially small error, where
the error exponent is now inferior to that of Theorem 7 by O(m), but the constant behind $O(\cdot)$
can be made arbitrarily small by choosing a sufficiently small $\eta$.

The real difficulty lies with the inverter because Mod is not a balanced function (that is, not all
images have the same number of preimages), thus we will not be able to obtain a perfect
inverter. Nevertheless, it is possible to construct an inverter with a close-to-uniform output in the
$\ell_\infty$ norm. This turns out to be as good as having a perfect inverter, and thanks to the following lemma,
we will still be able to use it to construct a wiretap protocol with zero leakage:

Lemma 33. Suppose that $f\colon [d]^n \to [d]^m$ is a $(k, 2^{-\Omega(m)})_d$ symbol-fixing extractor and that $X$
is a distribution on $[d]^n$ such that $\|X - \mathcal{U}_{[d]^n}\|_\infty \le 2^{-\Omega(m)}/d^n$. Denote by $X'$ the distribution $X$
conditioned on any fixing of at most $n - k$ coordinates. Then $f(X') \sim_{2^{-\Omega(m)}} \mathcal{U}_{[d]^m}$.



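Proof. By Proposition 24 it suffices to show that $X'$ is $2^{-\Omega(m)}$-close to an $(n, k)_d$ symbol-fixing
source. Let $S \subseteq [d]^n$ denote the support of $X'$, and let $\varepsilon/d^n$ be the $\ell_\infty$ distance between $X$
and $\mathcal{U}_{[d]^n}$, so that by our assumption, $\varepsilon = 2^{-\Omega(m)}$. By the bound on the $\ell_\infty$ distance, we know
that $\Pr_X(S)$ is between $\frac{|S|}{d^n}(1 - \varepsilon)$ and $\frac{|S|}{d^n}(1 + \varepsilon)$. Hence for any $x \in S$, $\Pr_{X'}(x)$, which is
$\Pr_X(x)/\Pr_X(S)$, is between $\frac{1}{|S|} \cdot \frac{1-\varepsilon}{1+\varepsilon}$ and $\frac{1}{|S|} \cdot \frac{1+\varepsilon}{1-\varepsilon}$. This differs from $1/|S|$ by at most
$O(\varepsilon)/|S|$. Hence, $X'$ is $2^{-\Omega(m)}$-close to $\mathcal{U}_S$. □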
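The $O(\varepsilon)$ step in the proof of Lemma 33 can be spelled out. The following is a short worked calculation, under the added assumption (not stated in the text, but implied for large $m$ by $\varepsilon = 2^{-\Omega(m)}$) that $\varepsilon \le 1/2$:

```latex
\[
\frac{1+\varepsilon}{1-\varepsilon} - 1 = \frac{2\varepsilon}{1-\varepsilon} \le 4\varepsilon,
\qquad
1 - \frac{1-\varepsilon}{1+\varepsilon} = \frac{2\varepsilon}{1+\varepsilon} \le 2\varepsilon .
\]
Hence, for every $x \in S$,
\[
\left| \Pr_{X'}(x) - \frac{1}{|S|} \right| \le \frac{4\varepsilon}{|S|},
\]
and summing over the $|S|$ points of the support bounds the statistical distance by
\[
\frac{1}{2} \sum_{x \in S} \left| \Pr_{X'}(x) - \frac{1}{|S|} \right| \le 2\varepsilon = 2^{-\Omega(m)} .
\]
```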

In order to invert our new construction, we will need to construct an inverter $\mathsf{Inv}_{q,p}$ for the
function $\mathsf{Mod}_{q,p}$. For that, given $x \in [p]$ we will just sample uniformly among its preimages. This is
where the non-balancedness of $\mathsf{Mod}$ causes problems, since if $p$ does not divide $q$ the distribution
$\mathsf{Inv}_{q,p}(\mathcal{U}_{[p]})$ is not uniform on $[q]$.
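The non-uniformity is easy to see on a small instance. The following Python sketch is only an illustration: the function names `mod_qp`, `preimages`, `inv_qp`, and `inv_of_uniform` are chosen here and do not come from the paper. It computes the exact output distribution of the sampling inverter on a uniform input.

```python
import random
from fractions import Fraction

def mod_qp(x, p):
    """Mod_{q,p}(x) = 1 + (x mod p), for x in [q] = {1, ..., q}."""
    return 1 + (x % p)

def preimages(y, q, p):
    """All x in [q] that Mod_{q,p} maps to y."""
    return [x for x in range(1, q + 1) if mod_qp(x, p) == y]

def inv_qp(y, q, p, rng=random):
    """Sampling inverter: a uniformly random preimage of y in [q]."""
    return rng.choice(preimages(y, q, p))

def inv_of_uniform(q, p):
    """Exact distribution of the inverter applied to a uniform y in [p]."""
    dist = {}
    for y in range(1, p + 1):
        pre = preimages(y, q, p)
        for x in pre:
            # y is drawn with probability 1/p, then x uniformly among len(pre) preimages
            dist[x] = Fraction(1, p) * Fraction(1, len(pre))
    return dist
```

For $q = 7$, $p = 3$, the value $y = 2$ has three preimages while $y = 1$ and $y = 3$ have two each, so `inv_of_uniform(7, 3)` assigns probability 1/9 to the points 1, 4, 7 and 1/6 to the remaining four points, rather than 1/7 everywhere. For $q = 6$, $p = 3$ the output is exactly uniform.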

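Lemma 34. Suppose that $q > p$. Given a distribution $\mathcal{X}$ on $[p]$ such that $\|\mathcal{X} - \mathcal{U}_{[p]}\|_\infty \le \frac{\varepsilon}{p}$, we
have $\|\mathsf{Inv}_{q,p}(\mathcal{X}) - \mathcal{U}_{[q]}\|_\infty \le \frac{1}{q} \cdot \frac{p + \varepsilon q}{q - p}$.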

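Proof. Let $X \sim \mathcal{X}$ and $Y \sim \mathsf{Inv}_{q,p}(\mathcal{X})$. Since we invert the modulo function by taking for a given
output a random preimage uniformly, $\Pr[Y = y]$ is equal to $\Pr[X = \mathsf{Mod}_{q,p}(y)]$ divided by the
number of $y$ with the same value for $\mathsf{Mod}_{q,p}(y)$. The latter number is either $\lfloor q/p \rfloor$ or $\lceil q/p \rceil$, so

\[ \frac{1 - \varepsilon}{p \lceil q/p \rceil} \le \Pr(Y = y) \le \frac{1 + \varepsilon}{p \lfloor q/p \rfloor}. \]

Bounding the floor and ceiling functions by $q/p \pm 1$, we obtain

\[ \frac{1 - \varepsilon}{q + p} \le \Pr(Y = y) \le \frac{1 + \varepsilon}{q - p}. \]

That is,

\[ \frac{-p - \varepsilon q}{q(q + p)} \le \Pr(Y = y) - \frac{1}{q} \le \frac{p + \varepsilon q}{q(q - p)}, \]

which concludes the proof since this is true for all $y$. □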
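The bound of Lemma 34 can be checked numerically on small instances. The sketch below uses exact rational arithmetic. The helper names `push_through_inv`, `linf_from_uniform`, and `lemma34_bound` are introduced here for illustration only, and the bound is taken to be $\frac{1}{q} \cdot \frac{p + \varepsilon q}{q - p}$, as derived in the proof.

```python
from fractions import Fraction

def mod_qp(x, p):
    """Mod_{q,p}(x) = 1 + (x mod p)."""
    return 1 + (x % p)

def push_through_inv(X, q):
    """Exact distribution of the sampling inverter's output on [q].

    X is a list of p probabilities, X[y-1] = Pr[X = y] for y in [p].
    """
    p = len(X)
    counts = [0] * (p + 1)
    for x in range(1, q + 1):
        counts[mod_qp(x, p)] += 1          # number of preimages of each y
    return [X[mod_qp(x, p) - 1] / counts[mod_qp(x, p)] for x in range(1, q + 1)]

def linf_from_uniform(dist):
    """l_infinity distance between dist and the uniform distribution on its domain."""
    n = len(dist)
    return max(abs(pr - Fraction(1, n)) for pr in dist)

def lemma34_bound(p, q, eps):
    """Right-hand side of Lemma 34: (1/q) * (p + eps*q) / (q - p)."""
    return Fraction(1, q) * (p + eps * q) / (q - p)

# Example: p = 5, q = 23, and an input that is (eps/p)-close to uniform with eps = 1/10.
p, q, eps = 5, 23, Fraction(1, 10)
X = [Fraction(1, 5) * (1 + eps), Fraction(1, 5) * (1 - eps)] + [Fraction(1, 5)] * 3
Y = push_through_inv(X, q)
print(linf_from_uniform(Y) <= lemma34_bound(p, q, eps))
```

Such a check does not replace the proof; it only confirms that the reconstructed inequality is consistent on concrete parameters.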

Now we describe the inverter $\mathsf{Inv}(x)$ for the extractor, again abusing the notation. First the
inverter calls $\mathsf{Inv}_{N,d^m}(x)$ to obtain $x_1 \in [N]$. Then it performs a random walk on the graph, starting
from $x_1$, to reach a vertex $x_2$ at the end, which is inverted to obtain $x_3 = \mathsf{Inv}_{d^{m'},N}(x_2)$ as a $d$-ary
string of length $m'$. Finally, the inverter outputs $y = (x_3, w)$, where $w$ corresponds to the inverse of
the random walk of length $n - m'$. It is obvious that this procedure yields a valid preimage of $x$.
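The extractor and its inverter can be sketched on a toy instance. Everything below is an assumption made for illustration: the class `ToyWalk` replaces the expander by a cyclic-shift graph on $[N]$ (which is not an expander but has invertible steps), the parameters are tiny, and all function names are hypothetical. The sketch only demonstrates that the inverter returns a valid preimage. It says nothing about the extraction error.

```python
import random

def mod_qp(x, p):
    """Mod_{q,p}(x) = 1 + (x mod p)."""
    return 1 + (x % p)

def inv_qp(y, q, p, rng):
    """Uniformly random preimage of y in [q] under Mod_{q,p}."""
    return rng.choice([x for x in range(1, q + 1) if mod_qp(x, p) == y])

def to_int(symbols, d):
    """Read a d-ary string (most significant symbol first) as an integer."""
    u = 0
    for s in symbols:
        u = u * d + s
    return u

def to_symbols(u, d, length):
    """Write the integer u as a d-ary string of the given length."""
    out = []
    for _ in range(length):
        out.append(u % d)
        u //= d
    return out[::-1]

class ToyWalk:
    """Toy stand-in for the graph G: vertices [N], symbol s shifts by shifts[s]."""
    def __init__(self, N, shifts):
        self.N, self.shifts = N, shifts
    def step(self, v, s):
        return 1 + (v - 1 + self.shifts[s]) % self.N
    def step_back(self, v, s):
        return 1 + (v - 1 - self.shifts[s]) % self.N

def sf_ext(y, d, m, m_prime, G):
    """Forward direction: start at Mod(u + 1), walk, output Mod(v) - 1."""
    u = to_int(y[:m_prime], d)                 # 0 <= u < d^{m'}
    v = mod_qp(u + 1, G.N)                     # start vertex in [N]
    for s in y[m_prime:]:
        v = G.step(v, s)
    return to_symbols(mod_qp(v, d ** m) - 1, d, m)

def inv_ext(x, d, m, m_prime, n, G, rng):
    """Inverter: sample the end vertex, walk backwards, invert the start vertex."""
    x1 = inv_qp(to_int(x, d) + 1, G.N, d ** m, rng)
    w = [rng.randrange(d) for _ in range(n - m_prime)]
    x2 = x1
    for s in reversed(w):                      # undo the last step first
        x2 = G.step_back(x2, s)
    x3 = inv_qp(x2, d ** m_prime, G.N, rng) - 1
    return to_symbols(x3, d, m_prime) + w

# Toy parameters: d = 2, m = 2, N = 6, m' = 3, n = 8.
G = ToyWalk(6, [1, 2])
rng = random.Random(0)
y = inv_ext([1, 0], 2, 2, 3, 8, G, rng)
print(sf_ext(y, 2, 2, 3, G) == [1, 0])
```

In the sketch the shift by one between the integer $u$ (or the output) and the vertex index is made explicit, whereas the text absorbs it into the notation.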

Using the previous lemma, if $x$ is chosen uniformly, $x_1$ will be at $\ell_\infty$-distance
$\varepsilon_1 := \frac{1}{N} \cdot \frac{d^m}{N - d^m} = \frac{1}{N} O(c^{-\eta m})$ from uniform. For a given walk, the distribution of $x_2$ will just
be a permutation of the distribution of $x_1$, and applying the lemma again, we see that the
$\ell_\infty$-distance of $x_3$ from the uniform distribution is
$\varepsilon_2 := \frac{1}{d^{m'}} \cdot \frac{N + \varepsilon_1 N d^{m'}}{d^{m'} - N} = \frac{1}{d^{m'}} O(c^{-\eta m})$.
This is true for all the $d^{n - m'}$ possible walks, so the $\ell_\infty$-distance of
the distribution of $y$ from uniform is bounded by $\frac{1}{d^n} O(c^{-\eta m})$. Applying Lemma 33 in an argument
similar to Lemma [6] concludes the proof.
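The asymptotic estimates for $\varepsilon_1$ and $\varepsilon_2$ can be worked out as follows. The calculation assumes that the construction guarantees $N \ge c^{\eta m} d^m$ and $d^{m'} \ge c^{\eta m} N$ (the reading of the parameter choice adopted here), and that Lemma 34 is applied first with $(q, p, \varepsilon) = (N, d^m, 0)$ and then with $(q, p, \varepsilon) = (d^{m'}, N, \varepsilon_1 N)$:

```latex
\[
\varepsilon_1 = \frac{1}{N} \cdot \frac{d^m}{N - d^m}
\le \frac{1}{N} \cdot \frac{1}{c^{\eta m} - 1},
\]
\[
\varepsilon_2 = \frac{1}{d^{m'}} \left( \frac{N}{d^{m'} - N}
  + \varepsilon_1 N \cdot \frac{d^{m'}}{d^{m'} - N} \right)
\le \frac{1}{d^{m'}} \left( \frac{1}{c^{\eta m} - 1}
  + \frac{1}{c^{\eta m} - 1} \cdot \frac{c^{\eta m}}{c^{\eta m} - 1} \right).
\]
Both right-hand sides are $O(c^{-\eta m})$ times the respective uniform probability since $c > 1$.
Each walk $w$ is chosen with probability $d^{-(n - m')}$, so for every $y = (x_3, w)$,
\[
\left| \Pr[y] - \frac{1}{d^n} \right| \le \frac{\varepsilon_2}{d^{\,n - m'}}
= \frac{1}{d^n}\, O(c^{-\eta m}).
\]
```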
B.6 Proof of Theorem [20]



We will need the following Proposition in our proof, which is easy to verify: 

Proposition 35. Let $f\colon \mathbb{F}_2^n \to \mathbb{F}_2^{\delta n}$ be a Boolean function. Then for every $\alpha > 0$, and $X \sim \mathcal{U}_n$, the
probability that $f(X)$ has fewer than $2^{n(1 - \delta - \alpha)}$ preimages is at most $2^{-\alpha n}$. □
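The verification is a counting argument, sketched here under the assumption that $f$ maps $n$ bits to $\delta n$ bits (so that it has at most $2^{\delta n}$ distinct outputs). Let $B$ denote the set of outputs with fewer than $2^{n(1 - \delta - \alpha)}$ preimages. The symbol $B$ is introduced only for this calculation:

```latex
\[
\Pr[f(X) \in B]
= \sum_{z \in B} \frac{|f^{-1}(z)|}{2^n}
< |B| \cdot \frac{2^{n(1 - \delta - \alpha)}}{2^n}
\le 2^{\delta n} \cdot 2^{-n(\delta + \alpha)}
= 2^{-\alpha n}.
\]
```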



Let Ext be the linear seeded extractor of Theorem 10 set up for input length $n$, seed length
$d = O(\log^3(n/\varepsilon))$, min-entropy $n(1 - \delta - \alpha)$, and output length $m = n(1 - \delta - \alpha) - O(d)$. Then
the encoder chooses a seed $Z$ for the extractor uniformly at random and sends it through the side
channel. For the chosen value of $Z$, the extractor is a linear function, and as before, given a message
$x \in \mathbb{F}_2^m$, the encoder picks a random vector in the affine subspace that is mapped by this linear
function to $x$ and sends it through the public channel. Then the decoder applies the extractor to
the seed received from the secure channel and the transmitted string. The resiliency of the protocol
can be shown in a similar manner as in Theorem 14. Specifically, note that by Proposition 35, with
probability at least $1 - 2^{-\alpha n}$, the string transmitted through the main channel, conditioned on the
observation of the intruder from the main channel, has a distribution $\mathcal{Y}$ with min-entropy at least
$n(1 - \delta - \alpha)$. Now in addition suppose that the seed $z$ is entirely revealed to the intruder. As the
extractor is strong, with probability at least $1 - \varepsilon$, $z$ is a good seed for $\mathcal{Y}$, meaning that the output
of the extractor applied to $\mathcal{Y}$ and seed $z$ is $\varepsilon$-close to uniform, and hence the view of the intruder
on the original message remains $\varepsilon$-close to uniform.
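The encoding and decoding steps of this protocol can be sketched in Python. The sketch is illustrative only and makes a strong simplifying assumption: instead of the extractor of Theorem 10, the seed selects a linear map of the hypothetical form $M = [I_m \mid A]$ over $\mathbb{F}_2$ (via the made-up helper `matrix_from_seed`), which makes uniform sampling from the affine subspace $\{y : My = x\}$ immediate. It therefore shows the mechanics of the protocol, not its security.

```python
import random

def matrix_from_seed(seed, n, m):
    """Hypothetical stand-in for the seeded linear map: returns the m x (n-m)
    bit matrix A that defines M = [I_m | A]."""
    rng = random.Random(seed)
    return [[rng.randrange(2) for _ in range(n - m)] for _ in range(m)]

def dot_bits(row, vec):
    """Inner product of two bit vectors over F_2."""
    return sum(a & b for a, b in zip(row, vec)) & 1

def apply_map(A, y):
    """Compute M y over F_2 for M = [I_m | A]."""
    m = len(A)
    tail = y[m:]
    return [y[i] ^ dot_bits(A[i], tail) for i in range(m)]

def encode(x, n, rng):
    """Pick a seed and a uniformly random y with M y = x.

    Returns (seed, y): the seed goes over the side channel, y over the
    public channel."""
    m = len(x)
    seed = rng.getrandbits(32)
    A = matrix_from_seed(seed, n, m)
    tail = [rng.randrange(2) for _ in range(n - m)]      # free coordinates
    head = [x[i] ^ dot_bits(A[i], tail) for i in range(m)]
    return seed, head + tail

def decode(seed, y, m):
    """Apply the linear map selected by the seed to the received string."""
    A = matrix_from_seed(seed, len(y), m)
    return apply_map(A, y)

rng = random.Random(1)
message = [1, 0, 1, 1]
seed, y = encode(message, 12, rng)
print(decode(seed, y, len(message)) == message)
```

Because the last $n - m$ coordinates are free and the first $m$ are then determined, choosing the free coordinates uniformly yields a uniform element of the solution set, which is the sampling step the text describes.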